Early Smoke Detection on Video Using Wavelet Energy

Most smoke detection systems today still rely on sensors that must receive specific particles before they can give a warning. Such systems take some time to react and are difficult to place in spacious rooms or outdoors. To overcome this, several studies have built smoke detection systems using various video processing techniques that can provide early warning. In this research, wavelet energy is used to detect smoke in video. To determine the candidate blocks in a frame that contain smoke, background subtraction and color analysis based on the HSV color space are performed. Spatial analysis and spatio-temporal analysis are then carried out using the wavelet energy method and accumulative motion orientation to detect the smoke. The system is tested on a combination of the dataset from previous research [1], videos downloaded from various sources, and a self-made dataset. On these datasets, the system reaches 91.05% accuracy at block level and 72.22% accuracy at frame level.


I. INTRODUCTION
Smoke detectors are widely used nowadays in various places. They are usually sensor-based and need to capture certain particles before giving a warning of smoke presence. However, such detectors need some time to react and must be placed in a position where smoke can reach the sensor.
Several studies have been carried out to replace smoke sensors, one approach being the use of video. A video-based smoke detection system is promising, given that smoke arises before a fire. The source of the smoke may be out of range of the video or hidden by other objects so that it cannot be captured, hence a smoke detection system is able to warn earlier than a fire detection system when dangerous events such as wildfires happen. Furthermore, video can cover large or open areas, and the system only needs to be installed on the video surveillance systems that are already widely used.
In developing smoke detection systems using video, there are several challenges, such as variations in the shape, motion, and texture of smoke, which can be affected by the background and luminance levels in the video [2]. In a previous study, Gomez-Rodriguez et al. [3] applied wavelet and optical flow methods to detect smoke presence. Meanwhile, Toreyin et al. [4] introduced the wavelet coefficient method by carrying out temporal and spatial wavelet analysis. Avgerinakis et al. [2] implemented a smoke detection algorithm using temporal HOGHOF descriptors and energy color statistics. In a recent study, Dimitropoulos et al. [1] [5] used spatio-temporal wavelet analysis, HOGHOF descriptors, and a linear dynamical system to detect smoke with high accuracy. The studies mentioned use the wavelet energy method because smoke, when it is not too thick, generally smooths the edges in an image [4] [6]. This property of smoke is then used as an indicator of its presence in video [4] [6]. In the wavelet domain, edges in an image are represented by extreme points [4] [6]. If smoke covers the edges, the extreme points of the edges are reduced, even if they do not reach zero. In fact, an edge remains in its position but loses some of its wavelet energy because it is covered by smoke, which is a semi-transparent object [4] [6].
In this research, spatial and spatio-temporal smoke analysis using the wavelet energy method is implemented to obtain the energy of smoke, and the accumulative motion orientation method is then used to determine the smoke motion orientation. Both processes are combined to distinguish smoke from non-smoke or smoke-colored objects, and the result is used to determine the presence of smoke in a video.

A. HSV Color Model
Every color in the real world is represented on a computer by a color space [7]. The HSV color model uses three components to represent colors: Hue (H), Saturation (S), and Value (V). The HSV color model was developed to resemble how humans perceive colors. In addition, the HSV color wheel is often used for generating high-quality graphics because it makes it easier to choose the required color [8]. Hue represents the true color and has a range of values from 0 to 360 degrees, which can be rescaled to 0 to 100. Each hue angle represents a color: red starts at 0 degrees, yellow at 60 degrees, green at 120 degrees, cyan at 180 degrees, blue at 240 degrees, and magenta at 300 degrees [8]. Saturation shows the purity of a color and has a range of values from 0 to 100, which can be rescaled to 0 to 1. A color looks more faded as saturation approaches 0 and more saturated as it approaches 100. Value indicates the brightness of a color and has a range of values from 0 to 100, which can be rescaled to 0 to 1. A value of 0 represents black, and the color becomes brighter until it reaches white at 100 [9].
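The hue angles and the 0 to 1 ranges described above can be checked with a short sketch. It uses Python's standard `colorsys` module, which is not the tool used in this research; the helper name `rgb_to_hsv_degrees` and the sample colors are illustrative assumptions.

```python
import colorsys


def rgb_to_hsv_degrees(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation 0..1, value 0..1)."""
    # colorsys expects and returns components scaled to the 0..1 range.
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v


# Illustrative sample colors (assumed, not taken from the dataset).
samples = [
    ("red", (255, 0, 0)),
    ("yellow", (255, 255, 0)),
    ("blue", (0, 0, 255)),
    ("light gray", (200, 200, 200)),
]
for name, rgb in samples:
    h, s, v = rgb_to_hsv_degrees(*rgb)
    print(f"{name}: H={h:.0f} deg, S={s:.2f}, V={v:.2f}")
```

The light gray sample has zero saturation and a high value, which is the kind of pixel that a saturation and value test would typically treat as smoke-colored.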

B. Frame Differencing
Frame differencing is a background subtraction technique that aims to separate the background from the foreground. It can also be used to detect moving objects. The advantage of this method is its simplicity, which results in a short processing time [10]. Figures 2 and 3 show an image or frame in a video with smoke along with its marker, in which the background is marked in black and the foreground is marked in white.
Frame differencing is implemented by differencing two frames, the frame at time t-Δt and the frame at time t. The frame differencing formula is defined as [11]:

$$D_t(x, y) = \begin{cases} 1, & |F_t(x, y) - F_{t-\Delta t}(x, y)| > T \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

Where:
$D_t(x, y)$ = binary image result of frame differencing
$x$ = row number
$y$ = column number
$F_t$ = frame at time t
$F_{t-\Delta t}$ = frame at time t-Δt
$T$ = threshold

The difference value at each pixel of both frames is measured. If the difference is greater than the threshold, the pixel is marked as 1 in the binary image, whereas if the difference is smaller than the threshold, it is marked as 0. This binary image is used as a marker to show the foreground and background of the current frame.
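The thresholding step described above can be sketched in a few lines of plain Python. Frames are assumed here to be lists of rows of grayscale intensities; the function name `frame_difference` and the toy frames are illustrative assumptions, and the threshold of 11 is the value reported in the conclusion.

```python
def frame_difference(current, previous, threshold):
    """Return a binary mask: 1 where |current - previous| > threshold, else 0."""
    return [
        [1 if abs(c - p) > threshold else 0 for c, p in zip(cur_row, prev_row)]
        for cur_row, prev_row in zip(current, previous)
    ]


# Toy 2 x 3 grayscale frames (assumed values for illustration only).
previous = [[10, 10, 10], [10, 10, 10]]
current = [[10, 40, 12], [10, 10, 90]]
mask = frame_difference(current, previous, threshold=11)
print(mask)
```

Pixels whose intensity changed by 11 or less stay 0 (background), so small fluctuations such as the change from 10 to 12 are ignored.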

C. Spatial Wavelet Analysis
A wavelet is a wave that oscillates from zero, increases, and then decreases back to zero [12]. Wavelets are used in transformation functions called wavelet transforms. In the field of image processing, the wavelet transform is more widely used than the Fourier transform. This is because the wavelet transform retains time information when a signal is transformed into the frequency domain [12]. In the smoke detection system, the wavelet transform is used to distinguish noise or edge signals, which is one of the main reasons wavelets are used in this system. An image containing smoke generally has a lower spatial value than an image containing a smoke-colored object with a lot of texture or shading, but a higher spatial value than an image containing a solid smoke-colored object. Therefore, spatial values are used to extract the characteristics of smoke. To obtain the spatial value, the 2-D wavelet analysis algorithm is applied to the white to gray image region. The spatial wavelet energy is determined from the high-low, low-high, and high-high frequency sub-bands with the following formula:

$$E(i, j) = HL(i, j)^2 + LH(i, j)^2 + HH(i, j)^2 \tag{2}$$

The spatial wavelet energy of each candidate block is estimated as the average of the energy of each pixel in the block [1]:

$$E_b = \frac{1}{N \times N} \sum_{i=1}^{N} \sum_{j=1}^{N} E(i, j) \tag{3}$$

Where:
$N$ = number of rows or columns in a block
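As a minimal sketch of this computation, the example below performs a single-level 2-D Haar decomposition on a grayscale block and averages the detail energy. The Haar wavelet is an assumption (the text does not name the wavelet family), the average is taken over sub-band positions rather than over block pixels, and the function name `haar_detail_energy` is hypothetical.

```python
def haar_detail_energy(block):
    """Average high-frequency energy of a grayscale block (single-level 2-D Haar).

    ASSUMPTION: Haar wavelet with orthonormal scaling; block sides are even.
    """
    rows, cols = len(block), len(block[0])
    total, count = 0.0, 0
    for i in range(0, rows - 1, 2):
        for j in range(0, cols - 1, 2):
            a, b = block[i][j], block[i][j + 1]
            c, d = block[i + 1][j], block[i + 1][j + 1]
            hl = (a - b + c - d) / 2.0  # horizontal detail
            lh = (a + b - c - d) / 2.0  # vertical detail
            hh = (a - b - c + d) / 2.0  # diagonal detail
            total += hl * hl + lh * lh + hh * hh
            count += 1
    return total / count


# Toy 4 x 4 blocks (assumed values): sharp edges, the same edges dimmed, and a flat region.
sharp = [[0, 200, 0, 200] for _ in range(4)]
hazy = [[100, 120, 100, 120] for _ in range(4)]
flat = [[100, 100, 100, 100] for _ in range(4)]
print(haar_detail_energy(sharp), haar_detail_energy(hazy), haar_detail_energy(flat))
```

The dimmed block keeps its edges in the same positions but with much lower energy, while the flat block has none, which mirrors the ordering of textured objects, smoke, and solid objects described above. The HL and LH naming convention varies between libraries, but the summed energy is unaffected.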

D. Spatio-temporal Wavelet Analysis
The shape of smoke changes irregularly due to airflow [1]. Over a period of time, smoke generally has higher spatial variations than smoke-colored objects [1] [13]. This process aims to measure spatio-temporal variations of the candidate block over a sequence of frames, in contrast to the previous process, which only computes spatial energy in a single frame [1].
Each direction is evaluated to estimate the motion orientation of a block. Based on previous research [11], three pixel displacements are used as the searching scheme, as shown in Figure 5. To estimate the motion orientation of a block, the difference between the grayscale block values in the current frame and the previous frame is calculated using the following error function [11]: (4)

For each direction there are three error values, one per displacement. The minimum of these is defined as the error of the direction [11]:

$$e(\theta) = \min_{dis \in \{1, 2, 3\}} e(\theta, dis) \tag{5}$$

where $e(\theta, dis)$ denotes the error value of direction $\theta$ at displacement $dis$. The direction with the minimum error is taken as the motion orientation of the block [11]:

$$\theta_{\min} = \arg\min_{\theta \in \{1, \ldots, 8\}} e(\theta) \tag{6}$$

However, according to [11], the motion orientation of a block obtained this way is the opposite of the real motion direction, so the direction code is modified by [11]: (7)

Figure 5. Searching scheme [11]
Figure 6. Smoke direction [11]

If a block has no motion, its motion orientation is coded as 0. To improve the accuracy of the estimated orientation, the motion orientation of a block is accumulated over time t. The direction with the maximal entry is considered the motion orientation of the block.
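The accumulation step can be sketched as a histogram over the per-frame direction codes of one block. The function name `accumulated_orientation` is hypothetical, and the way ties are resolved (the code seen first wins) is a property of this sketch, not something stated in [11].

```python
from collections import Counter


def accumulated_orientation(direction_codes):
    """Return the direction code with the maximal entry over the time window.

    Codes 1..8 are directions and 0 means no motion, as described in the text.
    ASSUMPTION: an empty window is treated as no motion (code 0).
    """
    if not direction_codes:
        return 0
    counts = Counter(direction_codes)
    return counts.most_common(1)[0][0]


# Per-frame direction codes of one block (assumed values for illustration).
codes = [3, 3, 2, 0, 3, 4]
print(accumulated_orientation(codes))
```

Accumulating over several frames smooths out single-frame estimation errors, such as the isolated codes 2 and 4 in the sample sequence.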

III. RESEARCH METHOD
In general, the process of the video smoke detection system in this research is shown in the diagram below.
The diagram above gives a general overview of the smoke detection system, which consists of five phases: preprocessing, color analysis, spatial analysis, spatio-temporal analysis, and smoke motion analysis. These processes are explained in detail below.

A. Video Preprocessing
Before smoke feature extraction is performed, each video is processed in the preprocessing phase. The flowchart of this phase is shown below.
Preprocessing aims to reduce the computational cost of processing the video during feature extraction. The preprocessing phase consists of two processes: frame splitting and background subtraction. Each processed frame is cut into blocks of 16 x 16 pixels. The purpose of this step is to reduce the region of interest and the computational cost.
In background subtraction, the frame differencing technique is used to compare two blocks: the block in the current frame and the block in the previous frame. The difference between both frames is calculated using equation (1). If the difference value of a pixel is greater than the threshold, the pixel is defined as a moving pixel; otherwise it is defined as a stationary pixel. The result of frame differencing is a binary block. After that, the current blocks are processed through morphological operations, namely erosion and dilation, to reduce noise pixels. A block is considered a moving block if it contains at least one moving pixel. These moving blocks then become the input of the feature extraction processes.
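The block selection rule above can be sketched as follows, given a binary mask such as the one produced by frame differencing. The erosion and dilation steps are omitted for brevity, and the function name `moving_blocks` and the toy mask are illustrative assumptions.

```python
def moving_blocks(mask, block_size=16):
    """Return (block_row, block_col) indices of blocks with at least one moving pixel."""
    result = []
    for top in range(0, len(mask), block_size):
        for left in range(0, len(mask[0]), block_size):
            rows = mask[top:top + block_size]
            # A block is moving as soon as any pixel inside it is marked 1.
            if any(any(row[left:left + block_size]) for row in rows):
                result.append((top // block_size, left // block_size))
    return result


# Toy 32 x 32 mask with a single moving pixel (assumed values for illustration).
mask = [[0] * 32 for _ in range(32)]
mask[20][5] = 1
print(moving_blocks(mask))
```

Only the block containing the moving pixel is passed on, so the later feature extraction stages skip the three stationary blocks entirely. If the frame size is not a multiple of 16, this sketch simply treats the smaller blocks at the border like any other block.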

B. Color Analysis
The next phase is color analysis. The moving blocks obtained from the previous process become the input for this process. In this process, blocks containing smoke color are separated from blocks that have no smoke color.
The input blocks from the previous process use the RGB color model, while this process requires blocks in the HSV color model. Therefore, each block needs to be converted into HSV using the following equation [1]: (8)
After the conversion, the Saturation and Value of each pixel in the block are examined. A pixel is considered smoke if: (10)
where the thresholds for Saturation and Value were determined experimentally using a number of training videos. The pixels in the block that are considered smoke are then counted. Following previous studies [1] [5], if at least 10% of the pixels in a moving block are smoke-colored, the block is considered a candidate smoke block.
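The 10% rule can be sketched as below. The direction of the two inequalities (saturation below its threshold and value above its threshold) is an assumption, since the pixel condition itself is not reproduced here; the default thresholds 0.37 and 0.64 are the values reported in the conclusion, and the function name `is_candidate_block` is hypothetical.

```python
import colorsys


def is_candidate_block(rgb_block, sat_threshold=0.37, val_threshold=0.64, min_ratio=0.10):
    """Return True if at least min_ratio of the block's pixels look smoke-colored."""
    pixels = [px for row in rgb_block for px in row]
    smoke = 0
    for r, g, b in pixels:
        _, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # ASSUMPTION: smoke pixels have low saturation and high value.
        if s < sat_threshold and v > val_threshold:
            smoke += 1
    return smoke >= min_ratio * len(pixels)


# Toy 2 x 5 blocks (assumed values): one gray pixel among red ones, and all red.
gray, red = (200, 200, 200), (255, 0, 0)
mostly_red = [[gray, red, red, red, red], [red, red, red, red, red]]
all_red = [[red] * 5, [red] * 5]
print(is_candidate_block(mostly_red), is_candidate_block(all_red))
```

With one smoke-colored pixel out of ten, the first block sits exactly at the 10% limit and is kept as a candidate.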

C. Spatial Wavelet Analysis
In this phase, the wavelet energy of each block is determined by calculating the high-low, low-high, and high-high sub-bands from the first-level wavelet decomposition. The input of this process is the blocks that have been converted to grayscale after color analysis.
The first level of wavelet decomposition has low-low (LL), high-low (HL), low-high (LH), and high-high (HH) sub-bands. However, only the energy from the high-frequency sub-bands is used in this system, which is calculated using formula (2). The energy of each pixel is then accumulated using equation (3) to obtain the spatial wavelet energy of the block. This energy is then used to decide whether a block is a smoke candidate block or a non-smoke candidate block.
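One way to turn the block energy into a decision is a simple range test. The sketch below uses the spatial wavelet energy range of 0.2 to 110 reported in the conclusion; treating both bounds as inclusive is an assumption, and the function name `in_smoke_energy_range` is hypothetical.

```python
def in_smoke_energy_range(block_energy, low=0.2, high=110.0):
    """Return True if the block energy lies inside the smoke range.

    ASSUMPTION: both bounds are inclusive; the text only gives the range.
    """
    return low <= block_energy <= high


# Assumed energies: a flat object, a smoke-like block, and a heavily textured object.
for energy in (0.0, 50.0, 400.0):
    print(energy, in_smoke_energy_range(energy))
```

The lower bound rejects solid smoke-colored objects with almost no edge energy, and the upper bound rejects strongly textured objects. The numeric scale of the energy depends on the wavelet and its normalization, so these bounds only apply to energies computed in the same way as in this research.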

D. Spatio-temporal Wavelet Analysis
In this phase, the spatio-temporal wavelet energy of the smoke candidate blocks is determined. The input of this process is the grayscale blocks that have been processed by spatial wavelet analysis.
Over a period of time, the spatial energy of the current block is determined. These values are then mapped to count the number of values classified as outliers. A block with a non-smoke object generally has more outliers than a smoke block. Therefore, a block is defined as a candidate smoke block if: (11)
where the threshold was determined using a number of training videos. The result of this process is the smoke blocks that have been selected based on their spatio-temporal energy.

E. Smoke Motion Analysis
Besides the spatio-temporal energy, the smoke motion orientation is analyzed using accumulative motion orientation. The results of spatio-temporal wavelet analysis are used as input for this process.
Accumulative motion orientation aims to determine the motion direction of a smoke block. The grayscale values of the current block are compared with the search block using equation (4). The motion direction is then analyzed using equations (5) and (6), and the real motion direction is determined using equation (7). Based on [11] [14], if the direction of the block is 2, 3, or 4, the block is considered a candidate smoke block.
After that, the motion directions of each block are accumulated over time t (100 frames in this research). Since a smoke block generally has a direction of 2, 3, or 4, the occurrences of these directions are summed up and compared with the occurrences of the other directions. If, within this period, the block has more occurrences of directions 2, 3, and 4 than of the other directions, the block is considered a smoke block.
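The majority rule over the accumulation window can be sketched as follows. Frames with no motion (code 0) are left out of both counts, which is an assumption because the text does not say how they are treated; the names `SMOKE_DIRECTIONS` and `is_smoke_block` are hypothetical.

```python
SMOKE_DIRECTIONS = {2, 3, 4}


def is_smoke_block(direction_codes):
    """Return True if directions 2, 3, and 4 outnumber the other directions.

    ASSUMPTION: code 0 (no motion) counts for neither side.
    """
    smoke_count = sum(1 for code in direction_codes if code in SMOKE_DIRECTIONS)
    other_count = sum(
        1 for code in direction_codes if code != 0 and code not in SMOKE_DIRECTIONS
    )
    return smoke_count > other_count


# Per-frame direction codes of two blocks (assumed values; the window is 100 frames
# in this research, shortened here for readability).
rising = [2, 3, 4, 5, 0, 0]
drifting = [1, 5, 6, 2]
print(is_smoke_block(rising), is_smoke_block(drifting))
```

A strict majority is used, so a block with equal counts on both sides is not classified as smoke in this sketch.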

IV. RESULTS AND DISCUSSION
In this research, a series of tests is carried out to measure the accuracy of the system. The parameter configuration used is: frame differencing threshold 11, saturation threshold 0.37, value threshold 0.64, spatial wavelet energy range from 0.2 to 110, and a smoke motion window of 100 frames. System testing consists of block-level testing and frame-level testing. Block-level testing measures system accuracy based on the number of smoke blocks correctly detected by the system out of all blocks in the testing video. Meanwhile, frame-level testing measures system accuracy based on the number of smoke frames correctly detected by the system out of all frames in the testing video. With this parameter configuration, the system reaches 91.05% accuracy at block level and 72.22% accuracy at frame level.
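The accuracy measure can be sketched as the share of correctly labeled items, where an item is a block for block-level testing or a frame for frame-level testing. Counting correct non-smoke labels as well as correct smoke labels is an assumption about the exact definition used here, and the function name `accuracy` and the sample labels are illustrative.

```python
def accuracy(predicted, actual):
    """Return the percentage of items whose predicted label matches the ground truth."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)


# Assumed labels for four blocks (1 = smoke, 0 = non-smoke).
predicted = [1, 0, 1, 1]
actual = [1, 0, 0, 1]
print(accuracy(predicted, actual))
```

The same function applies at both levels; only the unit being labeled changes.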

V. CONCLUSION
Based on the research and testing that have been done, it can be concluded that the wavelet energy method can be used to detect smoke. The test results show a system accuracy of 91.05% at block level and 72.22% at frame level. This accuracy is measured using a parameter configuration of frame differencing threshold 11, saturation threshold 0.37, value threshold 0.64, spatial wavelet energy range from 0.2 to 110, and a smoke motion window of 100 frames.
The main challenge for this system arises when the smoke appears in a frame whose background resembles the color of smoke. In addition, the system often incorrectly detects smoke in non-smoke videos. Based on the analysis of the testing results, future development should use a more effective smoke motion analysis method and add more varied video datasets, especially videos with backgrounds that resemble the color of smoke.

Figure 2. Image with moving object.
Figure 3. Marker of the moving region.
Figure 4. (Left) Frame and (right) graph of energy of smoke (red block and line) and non-smoke object (blue block and line) [1].

Symbols in the spatial wavelet energy formula: HL = high-low sub-bands of wavelet decomposition; LH = low-high sub-bands of wavelet decomposition; HH = high-high sub-bands of wavelet decomposition; i = row number; j = column number.

Symbols in the motion orientation equations: direction codes are defined as 1, 2, 3, 4, 5, 6, 7, and 8; dis = displacement, defined as 1, 2, and 3; block values are taken at the ith row and jth column of the block in frame t and in frame t-Δt; frame values are taken at the xth row and yth column of frame t and of frame t-Δt; Δt = time difference.