Chemical composition of rainwater harvested in East Malaysia
Abstract
As part of implementing rainwater harvesting as an alternative water supply for non-potable use, the chemical characteristics of rainwater were explored. The data set, covering three years (2017–2019), was provided by the Department of Chemistry, Malaysia. Chemometric techniques, including principal component analysis (PCA), were applied to identify the dimensionality of the rainwater data and to establish a purity of rainfall index (PRI) for assessing rainwater quality in the study area. Discriminant analysis differentiated the rain gauge stations, and cluster analysis was then applied to form smaller groups of stations. The results show that sea salt, secondary aerosols, trace metals, crustal origin, and organic acids dominated the dimensionality of the rainwater data, with a total variance of 53.38%, and that the PRI separated the stations into good purity of rainfall index (GPRI; Labuan and Danum Valley), moderate purity of rainfall index (MPRI; Kuching and Tawau), and bad purity of rainfall index (BPRI; Kota Kinabalu and Bintulu). The study suggests that the chemical composition of rainwater in the study area is attributable to local activities.
1. Introduction
1.1. Preliminary
Rainwater has been confirmed to be polluted in most regions of the world owing to the many pollutant loads in the atmosphere [1]. Understanding the chemical composition of rainwater is important for investigating the atmospheric conditions of a region and the concentrations of the soluble components that contribute to rainwater chemistry [2, 3]. The characteristics of rainwater depend on atmospheric particulate or gaseous constituents that are produced locally or transported from distant natural or anthropogenic sources [4]. Owing to the effect of local sources, the chemical composition of rainwater varies with geographical location [5]. Emissions of air contaminants have increased tremendously, primarily from anthropogenic activities in urban areas [6, 7], as a result of large populations, fast-growing economies, high energy use, agricultural development, and industrialization [6]. In a study of an area with intense industrial activity, New Castle, harvested rainwater was found to be seriously contaminated, mostly by trace metals [8]. In contrast, a study by Al-Khashman [3] carried out in Jordan found relatively low concentrations of trace metals, reflecting local conditions influenced by local anthropogenic sources.
The scavenging of air contaminants influences both the chemical composition and the pH of rainwater [9]. Acid precipitation is caused mostly by anthropogenic emissions such as sulfur oxides and nitrogen oxides [6, 10]. However, the acidity of rainwater depends on the relative contribution and neutralization of the major acidic and alkaline ions in the atmosphere [3]. It is important to determine the characteristics of the ions in water, both anionic and cationic. Both have therefore been the focus of extensive study over the last two decades because of environmental concerns [1, 9].
1.2. The Quality of Rainwater Harvesting
Historically, rainwater harvesting has been practiced in many dry regions of the world since as early as 4,500 B.C. [10, 11]. Today it is practiced in many countries to reduce reliance on domestic water from dams and reservoirs [12], to supply non-potable water to buildings in urban areas [13], and to address water shortages in semi-arid regions for agricultural and domestic uses [14]. Some constraints should nevertheless be addressed before harvested rainwater is used, especially if it is intended for potable use [15]. Many international studies have reported that rainwater is contaminated with various microbial pollutants [16–18] and with heavy metals originating from raindrops, catchment areas, and storage [19], and therefore requires proper treatment.
A rainwater harvesting system collects and stores rainwater that falls on a roof surface for later use [20]. Three principal components are required: the catchment area, the collection device, and the conveyance system [21]. Because collected rainwater offers various economic and environmental benefits, it can supply a significant share of non-potable water use without having to meet a drinking water quality standard [22]. For such uses, the presence of pollutants might not be of significant concern, and the criteria for treatment may be less strict, or treatment may not be necessary at all [23]. Appropriate treatment is needed if the rainwater is meant for drinking. Options include a first-flush system to decrease the accumulation of pollutants [24], post-cistern treatment, which has been shown to reduce total coliform [25], and slow sand filtration to reduce turbidity and heavy metal concentrations [26, 27]. In a broader sense, rainwater harvesting is necessary to reduce urban water consumption and increase its productive use [28]. Furthermore, rainwater harvesting can contribute to sustainable water supplies and water conservation around the world [13].
1.3. East Malaysia Climate and Rainwater Blessing
Malaysia is a developing country with continued economic development and population expansion [29, 30]. Rapid growth has led to several environmental problems, particularly water stress [31, 32]. The government has acknowledged that water scarcity arises from increasing demand for water supply, pollution of water sources, and encroachment into catchments [33]. With abundant rainfall, there is considerable potential to be explored in addressing the water shortage problem in Malaysia. The quality of rainwater is assessed in terms of its physical, chemical, and microbiological characteristics. The chemical content of harvested rainwater usually adheres to the WHO guidelines [34]. The rainwater samples were analyzed for total dissolved solids, conductivity, pH, and major cations and anions [35]. The composition of anions and cations was determined by ion chromatography, while pH and electrical conductivity were determined using a pH meter and a conductivity meter, respectively [36].
Generally, rainwater in Malaysia is relatively free from major pollutants [37]. According to Hashim et al. [38], rainwater quality in Malaysia satisfies the drinking water standard. This finding is supported by Asman et al. [39], whose study in Bangi found that harvested rainwater can be supplied for domestic purposes and, with additional treatment, as drinking water. Similarly, a study in Sandakan stated that rainwater quality is always better than surface water and groundwater quality [40]. However, in an urban area such as Kuala Lumpur, the pH of rainwater is slightly acidic [41, 42]. The use of rainwater for non-potable purposes is therefore recommended [43].
The Ministry of Housing and Local Government (MHLG) officially launched the rainwater harvesting system in Malaysia after the drought of 1998 [38]. Initial acceptance of rainwater harvesting was weak. The Ministry of Energy, Green Technology and Water (KTTHA) therefore introduced two new water laws, the Water Services Industry Act 2006 and the Water Services Commission Act 2006, which encourage the implementation of rainwater harvesting systems [35]. In 2012, the Malaysian government required large commercial and residential buildings, such as bungalows and semi-detached houses, to install a rainwater harvesting system under the Uniform Building By-Laws 1984 [44].
Various multivariate analyses have been used in modern research to analyze the chemical composition of water [45–48]. To date, however, few studies have applied multivariate statistical techniques to rainwater. This study therefore proposes a standard multivariate approach, comprising principal component analysis, discriminant analysis, and cluster analysis, to investigate the characteristics of rainwater in the study area. East Malaysia has recently been selected for in-depth research on the chemical characteristics of rainwater as an alternative water resource, particularly for non-potable uses, because of its water shortage problems [40]. It is therefore necessary to analyze the characteristics of the rainwater and provide a reliable assessment of its quality for human use. The aim of this research was to investigate the quality of rainwater in the study area for potential non-potable use. Rainwater harvesting is also believed to be one of the solutions to the water shortage problem worldwide.
2. Research Methodology
2.1. Geography and Rain Gauge Stations
East Malaysia comprises the two largest states, Sabah and Sarawak, which lie on the northwestern coast of Borneo Island. Sabah covers an area of 73,856 km², while Sarawak, the larger state with an area of 124,989 km², shares a border with Kalimantan, Indonesia. In the north, Sarawak has land frontiers with the two enclaves that make up Brunei (Fig. 1). East Malaysia experiences a wet and humid tropical climate, with rain virtually throughout the year and an average annual rainfall of 5,080 mm [49]. The monthly cumulative distribution of rainfall is influenced by the seasonal monsoons, namely the northeast monsoon (October to March) and the southwest monsoon (April to September) [50]. The northeast monsoon, the primary rainy season in Malaysia, produces heavy rainfall in Sarawak. The southwest monsoon is relatively drier, except in Sabah. Rain events consist of convective and widespread rain. Convective rain is characterized by intense rainfall over a short period and covers a limited area [51]. Rainfall quantities vary with the season. There are 504 rain gauge stations scattered around Bintulu, Kota Kinabalu, Kuching, Labuan, Danum Valley, and Tawau.
2.2. Data Collection
In this study, secondary hourly rainfall data were sourced from the Department of Chemistry Malaysia. All stations were identified based on data available from January 1, 2017, to December 31, 2019. The chemical variables in this study were ammonium (NH4+), calcium (Ca2+), fluoride (F−), magnesium (Mg2+), potassium (K+), sodium (Na+), nitrate (NO3−), sulfate (SO42−), acetate (C2H3O2−), chloride (Cl−), formate (CHO2−), methanesulfonic acid (CH4O3S), oxalate (C2O42−), copper (Cu), iron (Fe), manganese (Mn), mercury (Hg), nickel (Ni), cadmium (Cd), conductivity (EC), lead (Pb), pH, and zinc (Zn). This study focuses only on statistical analysis via chemometric approaches. All mathematical and statistical computations were performed using Excel 2013 (Microsoft Office). Principal component analysis (PCA), discriminant analysis (DA), and hierarchical algorithm cluster analysis (HACA) were performed with the XLSTAT add-in.
2.3. Principal Component Analysis
PCA is a powerful unsupervised pattern recognition technique used to explain the variance in a dataset of inter-correlated variables. It reduces the original variables to a smaller number of principal components (PCs) that account for most of the variance in the observed variables [52]. The PC can be expressed as:
$$z_{ij} = a_{i1}x_{1j} + a_{i2}x_{2j} + \cdots + a_{im}x_{mj} \qquad (1)$$
where z is the component score, a is the component loading, x is the measured value of the variable, i is the component number, j is the sample number, and m is the total number of variables. PCA reduces the dimensionality of a data set consisting of a large number of interrelated variables while retaining as much as possible of the variability present in the data set [53]. The number of components to keep was based on the Kaiser criterion, under which only components with an eigenvalue greater than one are retained [54]. The eigenvalues of the PCs measure their associated variance, the loadings give the participation of the original variables in the PCs, and the individual transformed observations are called scores. Following [55], factor loadings are classified as ‘strong,’ ‘moderate,’ and ‘weak,’ corresponding to absolute loading values of > 0.75, 0.75–0.50, and 0.50–0.30, respectively. After PCA, a varimax rotation of the factor loadings was conducted, which ‘cleans up’ the PCs by increasing the participation of the variables with a higher contribution and simultaneously reducing that of the variables with a lower contribution [56]. In this study, PCA was performed on a dataset of 504 objects and 23 physicochemical and metal parameters to condense the many variables into a smaller set and thereby identify the latent factors that contribute to the presence of these parameters in the rainfall data.
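The loading classes described above can be illustrated with a short sketch. It applies the quoted thresholds to loadings that are reported in Section 3.2 (Ni 0.51, oxalate 0.80, acetate 0.88, formate 0.54). The function name and the "negligible" label for loadings below 0.30 are assumptions made for illustration and do not come from the study.

```python
def classify_loading(loading):
    """Label a factor loading by its absolute size, using the quoted thresholds."""
    size = abs(loading)
    if size > 0.75:
        return "strong"
    if size >= 0.50:
        return "moderate"
    if size >= 0.30:
        return "weak"
    # Below 0.30 the text gives no class; this label is an assumption.
    return "negligible"


# Loadings quoted in Section 3.2 of the text.
quoted = {"Ni": 0.51, "oxalate": 0.80, "acetate": 0.88, "formate": 0.54}
labels = {name: classify_loading(value) for name, value in quoted.items()}
print(labels)
```

The sign of a loading is ignored here because the classification is stated in terms of absolute values.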
2.4. Developing Purity of Rainfall Index
Using the factor scores generated by PCA, each factor can be viewed as one aspect of rainwater. Factor scores can therefore be used as a single index indicating the aspect with which the factor is associated. The PRI is a composite of different variables. It was developed by weighting each factor score by its respective variance using the equation below:
$$\mathrm{PRI} = \sum_{i=1}^{n} F_i \, w_i \qquad (2)$$
where n is the number of factors selected, F_i is the score of factor i, and w_i is the percentage of variance explained by factor i. The weighted sum (the factor scores multiplied by their variability and summed) produced values of mixed sign, both negative and positive, which cannot serve directly as the targeted index. The values were therefore normalized to z-values (such that the variance of each variable equals unity) and rescaled using Eq. (3) to a range from 1 to 100, with univariate analysis used to obtain the final index value.
$$x'_i = a + \frac{(x_i - A)(b - a)}{B - A} \qquad (3)$$
where a is equal to 1, x_i is the actual observation, A and B are the lowest and highest factor scores, respectively, and b is the constant value of 100. In this study, univariate clustering was used to divide the PRI into three categories: Good, Moderate, and Bad. The lowest PRI values indicate that the rainwater at those rain monitoring stations was little affected by contaminants; such areas were therefore considered to have good rainwater quality.
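A minimal sketch of the index construction follows. It assumes that the weighted index is the sum of factor scores multiplied by their variance percentages, and that the rescaling is a min-max mapping onto the range 1 to 100 using the lowest and highest values. The variance percentages are those reported in Section 3.2, while the factor scores and sample names are hypothetical.

```python
def weighted_index(factor_scores, variance_pct):
    """Sum of factor scores, each weighted by its factor's percentage of variance."""
    return sum(f * w for f, w in zip(factor_scores, variance_pct))


def rescale(values, low=1.0, high=100.0):
    """Min-max rescaling onto [low, high]; this form is an assumption."""
    smallest, largest = min(values), max(values)
    span = largest - smallest  # must be non-zero for the mapping to be defined
    return [low + (v - smallest) * (high - low) / span for v in values]


# Variance percentages of the five retained factors (Section 3.2).
variance_pct = [23.25, 9.39, 8.68, 7.02, 5.04]

# Hypothetical factor scores for three samples, for illustration only.
samples = {
    "sample_a": [-0.8, -0.2, 0.1, -0.3, 0.0],
    "sample_b": [0.1, 0.4, -0.2, 0.2, 0.1],
    "sample_c": [1.5, 0.9, 0.6, 0.3, 0.4],
}

raw = {name: weighted_index(scores, variance_pct) for name, scores in samples.items()}
scaled = dict(zip(raw, rescale(list(raw.values()))))
print(scaled)
```

Under this sketch the sample with the lowest raw value maps to 1 and the one with the highest maps to 100, so a lower rescaled value corresponds to rainwater less affected by contaminants.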
2.5. Discriminant Analysis
Discriminant analysis is a supervised pattern recognition technique used to determine the variables responsible for separating the observations into different groups [57, 58]. The linear combinations of the independent variables found through this technique discriminate between the groups so that the misclassification error rates are minimized. Discriminant analysis was performed on the original data, which does not affect the results or their comparability with the other multivariate methods, and a discriminant function (DF) was constructed for each group as follows:
$$f(G_i) = k_i + \sum_{j=1}^{n} w_j \, p_j \qquad (4)$$
where i represents the number of groups (G), k_i is the constant inherent to each group, n is the number of parameters used to classify a set of data into a given group, and w_j is the weight coefficient assigned by DF analysis (DFA) to a given parameter (p_j) [59]. In this study, the purity of rainfall index derived from PCA was treated as the dependent variable, while the 23 physicochemical and metal parameters were treated as independent variables. DA was performed on the original data in standard, forward stepwise, and backward stepwise modes to construct the best DFs and confirm the indices developed by PCA.
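The way a discriminant function of this form is used for classification can be sketched as follows. Each group receives a score equal to its constant plus a weighted sum of the parameters, and the sample is assigned to the group with the highest score. The constants, weights, and sample values below are hypothetical and only two parameters are used; giving each group its own weight per parameter is also an assumption of this sketch.

```python
def discriminant_scores(sample, constants, weights):
    """Evaluate k_i + sum_j w_ij * p_j for every group i."""
    return {
        group: constants[group]
        + sum(weights[group][name] * value for name, value in sample.items())
        for group in constants
    }


def classify(sample, constants, weights):
    """Assign the sample to the group with the highest discriminant score."""
    scores = discriminant_scores(sample, constants, weights)
    return max(scores, key=scores.get)


# Hypothetical coefficients for illustration only; not taken from the study.
constants = {"GPRI": -20.0, "MPRI": -18.0, "BPRI": -15.0}
weights = {
    "GPRI": {"pH": 6.0, "Zn": 2.0},
    "MPRI": {"pH": 5.5, "Zn": 6.0},
    "BPRI": {"pH": 4.5, "Zn": 12.0},
}

clean_sample = {"pH": 6.5, "Zn": 0.1}
polluted_sample = {"pH": 4.5, "Zn": 1.0}
print(classify(clean_sample, constants, weights))
print(classify(polluted_sample, constants, weights))
```

Stepwise modes differ only in which parameters enter the sum; the scoring and assignment step stays the same.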
2.6. Hierarchical Algorithm Cluster Analysis
Cluster analysis is an assortment of techniques designed to perform classification by assigning observations to groups so that each group is more or less homogeneous and distinct from the other groups. Hierarchical algorithm cluster analysis (HACA) is the most common approach and provides an intuitive measure of similarity between any one sample and the entire data set [60–62]. HACA is an unsupervised pattern recognition technique used to identify natural grouping patterns and to group variables without any prior assumption. Objects with high similarity are clustered into the same group, while objects that differ strongly are clustered into different groups. The result of HACA is illustrated by a dendrogram. Ward’s method, using squared Euclidean distances as a measure of similarity, has a small space-distorting effect, uses more information about cluster contents than other methods, and has proved to be an extremely powerful grouping mechanism [56]. The dendrogram provides a visual summary of the clustering process, presenting a picture of the groups and their proximity, with a dramatic reduction in the dimensionality of the original data [63]. In this study, HACA was performed on the rainfall data set by means of Ward’s method, using squared Euclidean distance.
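The merging rule behind Ward’s method can be sketched in a few lines. At each step the two clusters whose union gives the smallest increase in within-cluster sum of squares are merged; that increase equals n_a n_b / (n_a + n_b) times the squared Euclidean distance between the two centroids. The site labels and two-variable profiles below are hypothetical, and this naive implementation is for illustration only, not the XLSTAT procedure used in the study.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clusters(data, n_clusters):
    """Merge labelled points agglomeratively until n_clusters groups remain."""
    clusters = [([label], [point]) for label, point in data.items()]
    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        # Pick the pair whose merge raises the within-cluster variance least.
        i, j = min(pairs, key=lambda ij: ward_cost(clusters[ij[0]][1], clusters[ij[1]][1]))
        labels_j, points_j = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i] = (clusters[i][0] + labels_j, clusters[i][1] + points_j)
    return sorted(sorted(labels) for labels, _ in clusters)


# Hypothetical two-variable profiles for six sites (illustration only).
sites = {
    "A": (1.0, 1.0), "B": (1.2, 0.9),
    "C": (5.0, 5.1), "D": (5.2, 4.8),
    "E": (9.0, 1.0), "F": (9.1, 1.3),
}
groups = ward_clusters(sites, 3)
print(groups)
```

Recording the cost of each merge in order would give the heights needed to draw a dendrogram.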
3. Presentation of Findings and Discussion
3.1. Atmospheric Particulate Matter in Rainwater
Summary statistics for the chemical composition in the study area are presented in Table 1. Most parameters displayed a wide variation in concentration, as reflected by large standard deviations. Ammonium, potassium, sodium, nitrate, sulfate, and chloride had the greatest standard deviations in the data set, at 39.80, 10.24, 28.00, 21.70, 10.60, and 33.78 mg/L, respectively. The high variance of ammonium may reflect ammonium concentrations that differ widely between rain gauge stations, that is, the concentration at each station is distinctive to its location. This high variance is attributed to the local activities at each rain gauge station. Agricultural areas such as Tawau emit considerably more ammonia into the atmosphere than industrial areas such as Kuching and Bintulu [64]. The emission of ammonia to the atmosphere is associated with livestock production [65].
Among the inorganic species, chloride was the most abundant by mass, followed by sodium, ammonium, nitrate, and sulfate. The high chloride concentration derives from the large body of the South China Sea. Among the metals, Zn was the most abundant by mass, with a mean value of 0.30 mg/L. Roofing material is a potential source of the zinc in rainwater [13]: zinc-based roofing material reacts with atmospheric oxygen to form zinc oxide and, if exposed to moisture, zinc hydroxide. The pH of the collected rainwater samples ranged from 4.3 to 7.0, indicating that rainwater within the study area was acidic. Acidity increased markedly as anion levels in the atmosphere increased, lowering the pH [66]. However, the pH in the study area appears to be neutralized at some rain gauge stations by alkaline ions (NH4+, Ca2+, Mg2+, Na+) [67].
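To give a sense of what the reported pH range means, the following sketch converts pH to hydrogen ion concentration using the definition pH = −log10[H+], treating activity as molar concentration (an approximation). The function name is an assumption for illustration.

```python
def hydrogen_ion_molar(ph):
    """Approximate hydrogen ion concentration in mol/L for a given pH."""
    return 10 ** (-ph)


# End points of the pH range reported for the collected samples.
most_acidic = hydrogen_ion_molar(4.3)
least_acidic = hydrogen_ion_molar(7.0)
ratio = most_acidic / least_acidic
print(f"[H+] at pH 4.3 is about {ratio:.0f} times that at pH 7.0")
```

Because the pH scale is logarithmic, a difference of 2.7 pH units corresponds to a factor of roughly 500 in hydrogen ion concentration.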
3.2. Chemical Composition of Rainwater
Bartlett’s sphericity test gave a chi-square value of 291.10 (d.f. = 253, p < 0.0001), and the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy was greater than 0.5. These results show that PCA was appropriate for dimensional reduction of the complex rainfall dataset, which was therefore subjected to further analysis [68]. Of the 23 principal components obtained from PCA with varimax rotation, only eight had an eigenvalue greater than 1.0 (Table 2); with a total explained variance of 67.62%, these were considered for further analysis. The most significant PCs were the first five, which together explained 53.38% of the data variability (PC1 = 23.25%, PC2 = 9.39%, PC3 = 8.68%, PC4 = 7.02%, and PC5 = 5.04%). The remaining PCs, with variances of 4.98%, 4.69%, and 4.56%, respectively, did not reveal any significant similarities among the rain gauge stations.
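The reported percentages can be cross-checked against the Kaiser criterion with a short calculation. The sketch assumes that PCA was run on standardized variables (a correlation matrix), so that the eigenvalues sum to the number of variables, 23, and each eigenvalue equals its variance share times 23. The eigenvalues themselves are in Table 2 and are not reproduced in the text, so the values computed here are approximations derived from the percentages.

```python
N_VARIABLES = 23

# Variance percentages of the eight retained components, as reported above.
variance_pct = [23.25, 9.39, 8.68, 7.02, 5.04, 4.98, 4.69, 4.56]

# Under the correlation-matrix assumption, eigenvalue = share * number of variables.
eigenvalues = [pct / 100 * N_VARIABLES for pct in variance_pct]
retained = [ev for ev in eigenvalues if ev > 1.0]  # Kaiser criterion

cumulative_5 = round(sum(variance_pct[:5]), 2)
cumulative_8 = round(sum(variance_pct), 2)

print([round(ev, 2) for ev in eigenvalues])
print(len(retained), cumulative_5, cumulative_8)
```

All eight implied eigenvalues exceed 1.0, in line with the eight components retained. The first five percentages sum to 53.38, and all eight sum to 67.61, which differs from the reported 67.62% only by rounding of the individual values.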
The factor loadings give the correlation between each chemical constituent and the factors (Table 3). The varimax factor (VF) loading plot (Fig. 2) revealed strong correlation and dependence among the chemical constituents of rainwater (a shorter distance corresponds to a stronger correlation between parameters) [68]. The first factor accounted for 23.25% of the total variance and had strong positive loadings on the sea salt ions (Mg2+, Na+, and Cl−) and a moderate positive loading on a heavy metal (Ni). Mg2+, Na+, and Cl− are major ions in aqueous solution. The strong correlation among these ions indicates that they came from the same source, such as marine aerosols [69].
In this case, the Ni loading (0.51) was not as high as the loadings of the other elements in the group, which may imply that Ni behaves independently within the group. Ni is one of the trace metals released from both natural sources and anthropogenic activities [70]. Nickel is present in the air as a result of industrial activities, including the combustion of coal, diesel oil, and fuel oil, the incineration of waste and sewage, stainless steel production, petrochemical plants, brick manufacture, and heavy ship traffic [71]. Although nickel loaded only moderately on this factor, its presence on its own, apart from the toxic metals loaded on the third factor, suggests that the nickel in the study area was emitted from multiple sources. In the study area, nickel could have come from natural sources such as vegetation and sea salt [72] and from anthropogenic sources, both stationary and mobile [73]. This factor therefore shows that the sea salt component in this study derived from sea salt spray and industrial activity.
The second factor, with high positive loadings on potassium and electrical conductivity and moderate positive loadings on ammonium, nitrate, and sulfate, explained 9.39% of the variance. The moderate positive loadings of the three inorganic ions (ammonium, nitrate, and sulfate) suggest that their precursors were released from similar emission sources, such as coal burning, vehicle exhaust, or the industrial sector, in addition to the influence of meteorological conditions and internal mixing [74]. However, ammonium showed a higher loading than nitrate and sulfate. Ammonium may have resulted from the reaction of other airborne particles and played a significant role as a neutralizing agent [75]. Ammonium, sulfate, and nitrate are secondary inorganic aerosols, formed by physical or chemical reactions of precursor gases such as sulfuric acid, nitric acid, and ammonia [76]. Sulfuric and nitric acids are atmospheric oxidation products of gaseous sulfur dioxide and nitrogen oxides, respectively. The strong correlation between nitrate and sulfate, both produced as secondary aerosols, points to similar major sources, including the power industry, the biomass industry, vehicular emissions, and residential sources [77, 78]. However, the modest correlation of potassium with nitrate and sulfate indicates that biomass burning aerosols contributed to the formation of nitrate and sulfate [79].
The ammonia was probably generated by agricultural practices such as livestock emissions, fertilizer use, and agricultural waste burning [80]. Through reaction in the air, ammonia is converted to ammonium aerosol, subject to the acid concentration in the air. The presence of ammonia in the air leads to the production and stabilization of particulate sulfate and nitrate [81]. Given enough gaseous ammonia (NH3), it may react with sulfate and nitrate to form ammonium sulfate ((NH4)2SO4) and ammonium nitrate (NH4NO3) via particle-gas formation and gas-to-particle conversion [76].
Conversely, potassium did not show a significant correlation with these secondary inorganic ions, although it had the highest loading on this factor compared with the other three ions. This may imply that potassium came from multiple sources, such as biomass combustion, biogenic processes, sea salt, and soil dust [82, 83]. Prior studies have reported that potassium in the atmosphere is generated by biomass burning [84, 85]. EC is strongly associated with the degree of acidity of the rainwater, which explains the loading of this parameter on this factor. A high concentration of inorganic particles in rainwater means a higher level of salinity and hence a higher electrical conductivity [55]. Factor two indicates that these chemical compounds originated from industrial activities.
The third factor comprised the trace metals copper (Cu), cadmium (Cd), lead (Pb), and zinc (Zn), all with positive loadings, and explained 8.68% of the total variance. The significant positive correlation among these metals suggests a possible common origin. This factor represents emissions from various sources, considered to include vehicular exhaust, industrial activities, oil and waste combustion in incinerators, and coal burning in electricity generation plants [86, 87]. However, the toxic trace metals on this factor fall into two parts: Cu and Pb have strong loadings, while Zn and Cd have moderate positive loadings. Cu and Pb were significantly correlated, based on their loadings on this factor, and may have originated mainly from industrial sources and traffic pollution [88]. Cu can be emitted from the linings of vehicles, especially in congested traffic, while Pb comes from the use of leaded gasoline and batteries [89].
Cd and Zn both have moderate positive loadings on this factor, suggesting a similar source; the possible sources are predominantly the earth’s crust, together with anthropogenic sources [90]. The primary anthropogenic sources of Cd and Zn include fuel and coal combustion and vehicular and industrial emissions [91]. Cd most likely came from oil leakage from automobiles, along with car abrasion and car lubricants [92]. Zn may be produced by heavy wear and tear of tires and brake linings [90, 93]. Many studies have reported zinc in rainwater due to roofing material [94–96]. This factor further shows that the main sources of trace metals in this study were anthropogenic activities.
Formate, methanesulfonic acid, and oxalate had the significant loadings on the fourth factor. This factor, identified as organic acids, explained 7.02% of the variance. The loading of these organic acids on the same factor may indicate that they originated from marine aerosol [97]. The high loading of oxalate (0.80) on this factor may reflect the formation of oxalate via in-situ photochemical reactions of precursors such as fatty acids [98]. In line with past studies, oxalate was the most abundant water-soluble species detected in the marine atmosphere [99, 100]. Unlike oxalate, which originates from multiple sources, methanesulfonic acid (MSA) is produced only by the oxidation of dimethyl sulfide (DMS) in the air [101]. DMS is a primary source of biogenic sulfur, emitted in large amounts into the marine atmosphere [102]. MSA has been detected in the atmosphere over coastal regions [103]. The formate in this study probably came mainly from daytime photochemical reactions [104]. This explains the moderate positive loadings of MSA and formate on this factor, which were not as high as that of oxalate. This factor indicates that the organic acids originated mainly from photochemical processes.
The fifth factor showed a strong positive loading only on manganese and explained 5.04% of the total variance. Mn emitted into the atmosphere is mostly a mixture of natural crustal materials and anthropogenic compounds produced by traffic and industry [105]. In this study, Mn was not significantly correlated with the other metals. Mn probably originated mainly from natural sources, particularly soil [106]. However, this study also revealed that traffic density contributed to the sources of Mn [107]. Previous studies stated that Mn in the atmosphere results from dust emitted by vehicles [92]. The high loading of Mn alone on this factor is interpreted as the result of natural enrichment by weathering and pedogenesis [108].
The sixth factor grouped the carboxylic acids (acetate and formate) and accounted for around 4.98% of the variability. It was distinguished by a high loading on acetate (0.88) and a moderate loading on formate (0.54). This factor is related to a wide variety of sources, including direct emissions from biomass burning, vegetation, and vehicles, and urban and coastal areas [109]. A previous study reported that these carboxylic acids were the most abundant monocarboxylic acids in dust particles [110]. The higher loading of acetate compared with formate on this factor may reflect mainly primary sources such as industrial and vehicular emissions [104], whereas formate is possibly dominated by photochemical production [110].
The seventh factor explained 4.69% of the variance and consisted of a negative loading on iron and a high loading on pH. The eighth factor accounted for 4.56% of the total variance, with positive loadings on fluoride and nitrate. Because factors 6 to 8 each explained less than 5.0% of the variance, this study concludes that the underlying structure of the rainfall data in East Malaysia was associated with five, rather than eight, factors, accounting for approximately 53.38% of the variation in the dataset. Previous studies support the view that factors explaining less than 5.0% of the variance are not significant [111]. This is associated with the criteria for factor extraction [112].
Consequently, this study does not discuss the seventh and eighth factors in detail, and the variables iron and pH (factor 7) and fluoride (factor 8) were not considered significant. Although nitrate loaded on factor 8, it must still be accounted for because it also falls in factor two. Based on the PCA results, most factors in this study were dominated by water-soluble ions. These ionic species are the major components of atmospheric aerosol and are associated with the acidification of rainwater [113]. The presence of these mineral components can be attributed to both natural and anthropogenic processes [82].
3.3. Discrimination of Spatial Variation
In this study, DA was applied to investigate the spatial variation of rainwater characteristics between the rain gauge stations. Three spatial groups, 1, 2, and 3, representing good purity rainfall index (GPRI), moderate purity rainfall index (MPRI), and bad purity rainfall index (BPRI), respectively, were treated as the dependent variable, and the 23 physicochemical and metal parameters were treated as independent variables. DA was performed in standard, forward stepwise, and backward stepwise modes, which used 20, 15, and 16 discriminant variables, respectively, with a reported classification accuracy of 94.25%. The forward stepwise mode gave the highest correct classification, 94.84%, discriminating the three spatial classes with 15 variables: magnesium, sulfate, formate, lead, zinc, nitrate, calcium, sodium, oxalate, acetate, ammonium, manganese, methanesulfonic acid, cadmium, and pH (Table 4). Given the high proportion of correct classifications, DA supports the PCA result and confirms the PRI assigned to each rain gauge station.
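The reported accuracies can be related to counts of correctly classified observations. The sketch below assumes that the classification matrix covers all 504 objects of the dataset; the counts themselves are not stated in the text and are inferred here from the percentages.

```python
N_OBSERVATIONS = 504


def matching_counts(accuracy_pct, n=N_OBSERVATIONS):
    """Return every count of correct classifications that rounds to accuracy_pct."""
    return [k for k in range(n + 1) if round(100 * k / n, 2) == accuracy_pct]


# Accuracies reported for the DA modes.
counts = {pct: matching_counts(pct) for pct in (94.25, 94.84)}
print(counts)
```

Under this assumption, 94.25% corresponds to 475 of 504 observations and 94.84% to 478 of 504, so the modes differ by only three observations.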
The Wilks' lambda test for the forward stepwise mode gave a value of 0.10 (p < 0.0001), showing significant differences between the three groups (Fig. 1). The null hypothesis states that the mean vectors of the three classes (GPRI, MPRI, and BPRI) are equal. The alternative hypothesis states that at least one of the mean vectors differs from another. Since the computed p-value is lower than the significance level alpha = 0.05, the null hypothesis is rejected in favour of the alternative hypothesis. The risk of rejecting the null hypothesis when it is true is lower than 0.01%. Thus, the three groups are indeed different from one another. Fig. 3 shows that the rain gauge stations are very well differentiated on the factor axes extracted from the original explanatory variables. These results show that discriminant analysis successfully grouped the rain gauge stations according to the chemical compounds in their rainwater.
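For readers unfamiliar with the statistic, Wilks' lambda is the ratio det(W)/det(T), where W is the pooled within-group scatter matrix and T is the total scatter matrix; values close to 0 indicate well-separated group means, and values close to 1 indicate no separation. The following minimal sketch computes it for two variables only, using synthetic placeholder observations rather than the study data. The helper names are hypothetical, and the p-value approximation is not included.

```python
def mean_vector(rows):
    """Mean of each of the two variables across the given observations."""
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(2)]


def scatter(rows, center):
    """2x2 sum of outer products of deviations from the given center."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - center[0], r[1] - center[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def wilks_lambda(groups):
    """Wilks' lambda = det(within-group scatter) / det(total scatter)."""
    all_rows = [r for rows in groups.values() for r in rows]
    total = scatter(all_rows, mean_vector(all_rows))
    within = [[0.0, 0.0], [0.0, 0.0]]
    for rows in groups.values():
        s = scatter(rows, mean_vector(rows))
        for i in range(2):
            for j in range(2):
                within[i][j] += s[i][j]
    return det2(within) / det2(total)


# Synthetic two-variable observations per class (placeholders, not study data).
groups = {
    "GPRI": [(1.0, 2.0), (1.2, 2.1), (0.9, 1.8)],
    "MPRI": [(3.0, 4.0), (3.1, 4.2), (2.9, 3.9)],
    "BPRI": [(5.0, 1.0), (5.2, 1.1), (4.9, 0.8)],
}
lam = wilks_lambda(groups)
print(f"Wilks' lambda = {lam:.6f}")
```

Because the synthetic groups are deliberately far apart, the resulting lambda is very close to 0; the study's value of 0.10 with 15 variables reflects strong but less extreme separation.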
3.4. Spatial Similarities and Site Grouping in Different Regions
HACA has proved useful in solving classification problems where the objective is to sort variables into groups such that the degree of association is strong among members of the same class and weak among members of different classes. In this study, HACA was performed on 23 physicochemical and metal parameters to study the spatial variations of the rain gauge stations based on their similarity levels. The result is presented in the dendrogram shown in Fig. 4. Six cities covering 504 stations in Labuan, Danum Valley, Kuching, Tawau, Kota Kinabalu, and Bintulu were grouped into three clusters (Fig. 2). Cluster 1 comprises Labuan and Danum Valley and is named good purity of rainfall index (GPRI). Cluster 2 comprises the urban areas of Kuching and Tawau and is named moderate purity of rainfall index (MPRI). Cluster 3 comprises Kota Kinabalu and Bintulu and is named bad purity of rainfall index (BPRI).
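The agglomerative procedure behind such a dendrogram can be sketched as follows: each site starts as its own cluster, and the two closest clusters are merged repeatedly until the desired number remains. The sketch assumes Euclidean distance with average linkage, which may differ from the linkage used in the study, and the two-variable site profiles are synthetic placeholders constructed to reproduce the reported grouping, not measured values.

```python
from itertools import combinations
from math import dist

# Synthetic two-variable profiles per site (placeholders, not study data).
profiles = {
    "Labuan": (1.0, 1.0),
    "Danum Valley": (1.2, 0.9),
    "Kuching": (3.0, 3.1),
    "Tawau": (3.2, 2.9),
    "Kota Kinabalu": (6.0, 5.8),
    "Bintulu": (5.9, 6.1),
}


def average_linkage(a, b, points):
    """Mean Euclidean distance over all pairs drawn from clusters a and b."""
    pairs = [(p, q) for p in a for q in b]
    return sum(dist(points[p], points[q]) for p, q in pairs) / len(pairs)


def agglomerate(points, n_clusters):
    """Merge the two closest clusters until only n_clusters remain."""
    clusters = [[name] for name in points]
    while len(clusters) > n_clusters:
        # combinations yields index pairs with i < j, so deleting j is safe.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]], points),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


clusters = agglomerate(profiles, n_clusters=3)
for members in clusters:
    print(members)
```

With these placeholder profiles the three clusters pair Labuan with Danum Valley, Kuching with Tawau, and Kota Kinabalu with Bintulu, mirroring the GPRI, MPRI, and BPRI grouping described in the text.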
3.4.1. Good purity of rainfall index (Labuan and Danum Valley)
Cluster 1 corresponds to the good purity rainfall index (GPRI) category and is characterized by the largest Euclidean distance of the three clusters. Labuan and Danum Valley fall under GPRI. The rain gauge stations in these districts are mostly located in rural areas covered with forest reserves or in protected areas such as Danum Valley. Danum Valley, situated downstream of Sabah, is far from the major point and non-point pollution sources in the Sabah region. It is a forest conservation area that covers approximately 438 km² of protected unlogged forest and is surrounded by an extensive area of production forest [114]. It is the largest remaining stand of lowland dipterocarp rain forest in Sabah, with limited human settlement and the least human impact on soil and water [115]. Labuan Island, a federal territory off the coast of the state of Sabah, is located at the northwestern tip of Borneo Island [116]. Although Labuan is known for offshore support for the deep-water oil and gas industry, some small industrial areas, and tourism activity [117], the result showed that it had rainwater characteristics similar to those of Danum Valley. The impact of urbanization and industrial activities is minimal and contributes little to the rainwater behavior in these two districts.
3.4.2. Moderate purity of rainfall index (Kuching and Tawau)
Cluster 2 corresponds to MPRI and comprises Kuching and Tawau; the rain gauge stations in both regions are similar in that they lie close to coastal, industrial, and residential areas. Kuching is the capital of the state of Sarawak, located in the northwestern part of Borneo. Tawau is located on the southeast coast of northern Borneo, overlooking the Celebes Sea to the east [118]. Substantially low pH levels were detected in these districts because of their proximity to the coast, with a relatively strong influence of marine aerosols on the particle concentration [119]. This result strengthens the finding that sea salt and organic acid species dominate in the atmosphere. The concentration of trace metals in this cluster was moderately high, contributed by anthropogenic activities and natural processes (river discharge, hydrothermal circulation), as Kuching and Tawau are major urban centres in Sarawak and Sabah, respectively [120]. Industrial areas such as those in Kuching City are responsible for a large share of trace metal emissions. The combustion of fuels by automobiles and aircraft is another source of heavy metal emissions in this cluster. The power stations in Kuching and Tawau, which are the major suppliers of electricity and are diesel- and coal-fired, are important sources of hazardous trace elements, including metals, emitted to the atmosphere [121]. Moreover, the precursor gases of secondary aerosols were found to originate predominantly from the burning of fossil fuels and from motor vehicles [122]. Both cities have good infrastructure and domestic river ports, which add to congested land and water transport and hence to the release of vehicle emission particles into the atmosphere [123].
3.4.3. Bad purity of rainfall index (Kota Kinabalu and Bintulu)
Cluster 3 corresponds to the bad purity rainfall index (BPRI) category and consists of Kota Kinabalu (Sabah) and Bintulu (Sarawak). The rain gauge stations in cluster 3 are located in the city centers of Kota Kinabalu and Bintulu. These are major urban areas, surrounded by residential areas, round-the-clock heavy traffic congestion, airports, and industrial activities [117]. Kota Kinabalu is the capital of Sabah state, located in the north of Borneo, and is the busiest urban center in Sabah [118]. The abundance of sea salt particles (Mg²⁺, Na⁺, and Cl⁻) and organic acids (formate, methanesulfonic acid, and oxalate) recorded in both districts is to be expected given their location near the coast, facing the South China Sea. Lewandowska & Falkowska [124] reported that the evaporation of sea surface spray emits a large amount of sea salt to the atmosphere.
The major sources of heavy metals in these two areas were coal combustion and industrial emissions in the region [91]. Since Kota Kinabalu and Bintulu have several stationary industrial sources, such as power stations, industrial fuel combustion, and domestic fuel combustion, it can be concluded that large quantities of the precursors of inorganic ions (ammonium, nitrate, and sulfate) are released by industrial activities. Port activities such as shipping, together with the airports, also contribute to the emission of these precursor gases. The fuel oil power plant located in this industrial area was also an important contributor to the Ni particulate levels observed at the rain gauge stations. Both Kota Kinabalu and Bintulu have power stations that generate electricity for Sabah and Sarawak. Trace metals (Cu, Cd, Pb, and Zn) are highly correlated in factor 3 (trace metals) and are usually associated with different sources, namely coal combustion, incineration, and traffic (mainly tire and brake wear). The busy roads congested with automobiles, together with shipping, in Kota Kinabalu and Bintulu contribute to the release of trace metals into the surrounding atmosphere. The combustion of fuels by vehicles and aircraft and the burning of fuels in industrial activities are also suspected sources of heavy metals. Johansson et al. [125] reported that road traffic emissions, which occur at ground level in the most densely populated parts of the districts, were the major sources of heavy metals (Cu, Cd, Pb, Ni, Mn, and Zn). Similarly, Furusjö et al. [126] and Wahlin et al. [127] observed that brake wear was the major source of heavy metals in Sweden and Copenhagen, respectively. Abbasi & Abbasi [128] agreed that the concentration of heavy metals is highest in urban and industrial areas.
In addition, because the study area lies in the coastal zone, pollution problems arise from urbanization and population growth, which in turn leads to the deterioration of rainwater quality.
From the HACA result, and given that all districts except Danum Valley are located near the coast, it can be concluded that most of the rainwater in the study area was affected by ionic particles distributed within the marine atmosphere. Because these ions are common to nearly all stations, they were not the primary cause of the variation in rainwater characteristics in the study area. However, the rain gauge stations in urbanized areas (surrounded by industrial zones) received abundant heavy metal particles and precursor gases from various sources, and this changed the rainwater behaviour markedly. The results indicate that HACA offers a reliable classification of rainwater in the study area. Additionally, this study confirms that it is conceivable to plan an optimal, cost-effective monitoring strategy with fewer rain gauge stations [129].
4. Concluding Remarks
In East Malaysia, rainwater acidity was associated with Mg²⁺, Na⁺, and Cl⁻, signifying that these ions mostly originated from marine aerosols. In contrast, SO₄²⁻, NO₃⁻, and NH₄⁺ are secondary aerosols formed by chemical reactions in the atmosphere and mainly originate from anthropogenic activities. Alkaline ions such as Mg²⁺, Na⁺, and NH₄⁺ appear to have neutralized the acidity of the rainwater. Factor analysis shows that the anthropogenic activities contributing to the emission of toxic metals (8.68%) did not significantly affect rainwater quality. The anion and cation balance varied among the rain gauge stations according to local activities. The result shows that sea salt, secondary aerosols, trace metals, crustal origin, and organic acid dominated, with a total variance of 53.38%. The purity of rainfall index, with its good, moderate, and bad categories, can serve as an indicator of the quality of harvested rainwater in East Malaysia. With a correct classification of 94.84%, only 15 variables (magnesium, sulfate, formate, lead, zinc, nitrate, calcium, sodium, oxalate, acetate, ammonium, manganese, methanesulfonic acid, cadmium, and pH) were needed to differentiate effectively between the three spatial groups (GPRI, MPRI, and BPRI). This study suggests that an optimal, cost-effective monitoring approach with fewer rain gauge stations is conceivable.
Acknowledgment
This study is part of the Special Research Grant Scheme UniSZA/2017/SRGS/18. The authors would like to express their appreciation to the Department of Chemistry Malaysia and the Meteorology Department, Malaysia, for supplying the data required for the completion of this study. The authors also wish to thank the East Coast Environmental Research Institute (ESERI), UniSZA, for the use of its research facilities.
Notes
Author Contributions
A.S.N.F. (Ph.D) wrote the manuscript. I.A. (Senior lecturer) revised the manuscript. J.H. (Professor) assisted with the statistical analysis. A.R.B. (Professor) revised the manuscript. L.F. (Senior lecturer) revised the manuscript. H.N.M. (Master’s student) edited the manuscript. A.N. (Master’s student) edited the manuscript. Z.M.A. (Chemist) assisted with the statistical analysis. M.T.A.T. (Lecturer) edited the manuscript. H.M.H.F. (Lecturer) edited the manuscript. M.R.I.S.R. (Master’s student) edited the manuscript. J.J.R.A. (Director) conducted the final editing of the manuscript. D.S.M. (Director) conducted the final editing of the manuscript.