Real Time Object Tracking in H.264/AVC Using Polar Vector Median and Block Coding Modes
This paper presents a real-time video surveillance system capable of tracking multiple objects using Polar Vector Median (PVM) and Block Coding Modes (BCM) with Global Motion Compensation (GMC). The method works in the compressed domain and uses the motion vectors (MVs) and BCM from the compressed bitstream to perform real-time object tracking. We propose to do this based on the neighboring MVs using a method called PVM. Since global motion (GM) adds to the object’s native motion, it is important for accurate tracking to remove GM from the MV field prior to further processing. The proposed method is tested on a number of standard sequences, and the results show its advantages over some current state-of-the-art methods.
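To illustrate the kind of neighbourhood filtering the abstract refers to, the following is a minimal sketch of the classic vector median of a set of motion vectors: the member whose summed Euclidean distance to all the others is smallest. It is not the paper's Polar Vector Median, whose details the abstract does not give; the function name and the sample vectors are assumptions made for illustration.

```python
import math

def vector_median(vectors):
    """Return the vector minimising the sum of Euclidean distances to the rest."""
    def total_distance(v):
        return sum(math.hypot(v[0] - w[0], v[1] - w[1]) for w in vectors)
    return min(vectors, key=total_distance)

# Hypothetical motion vectors (dx, dy) of blocks neighbouring a block
# that has no usable motion vector of its own; the last one is an outlier.
neighbours = [(2, 1), (2, 2), (3, 1), (15, -9)]
estimate = vector_median(neighbours)
```

Because the result is always one of the input vectors, a single outlier cannot drag the estimate away from the dominant motion, which is the usual reason for preferring a vector median over a component-wise mean.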
Key Frame Based Video Summarization via Dependency Optimization
With the rapid growth of digital videos and data
communications, video summarization that provides a shorter version
of the video for fast video browsing and retrieval is necessary.
Key frame extraction is one of the mechanisms to generate video
summary. In general, the extracted key frames should both represent
the entire video content and contain minimum redundancy. However,
most of the existing approaches heuristically select key frames; hence,
the selected key frames may not be the most different frames and/or
not cover the entire content of a video. In this paper, we propose
a method of video summarization which provides the reasonable
objective functions for selecting key frames. In particular, we apply
a statistical dependency measure called quadratic mutual information
as our objective functions for maximizing the coverage of the
entire video content as well as minimizing the redundancy among
selected key frames. The proposed key frame extraction algorithm
finds key frames as an optimization problem. Through experiments,
we demonstrate the success of the proposed video summarization
approach, which produces a video summary with better coverage of
the entire video content and less redundancy among key frames
compared to the state-of-the-art approaches.
Extended Constraint Mask Based One-Bit Transform for Low-Complexity Fast Motion Estimation
In this paper, an improved motion estimation (ME) approach based on the weighted constrained one-bit transform is proposed for block-based ME employed in video encoders. Binary ME approaches utilize a low bit-depth representation of the original image frames with a Boolean exclusive-OR based, hardware-efficient matching criterion to decrease the computational burden of the ME stage. The weighted constrained one-bit transform (WC-1BT) based approach improves the performance of conventional C-1BT based ME by employing a 2-bit-depth constraint mask instead of a 1-bit-depth mask. In this work, the range of the constraint mask is further extended to increase the ME performance of the WC-1BT approach. Experiments reveal that the proposed method provides better ME accuracy compared to existing similar ME methods in the literature.
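The exclusive-OR matching criterion mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the blocks are already binarised, the function name `nnmp` (number of non-matching points) and the sample data are hypothetical, and the rule that a pixel counts when either block's 1-bit mask is set is an assumed form of the constraint. The weighted and extended masks proposed in the paper are not reproduced.

```python
def nnmp(cur_bits, ref_bits, cur_mask=None, ref_mask=None):
    """Count non-matching points between two equally sized one-bit blocks.

    Without masks this is the plain XOR count. With masks, a pixel is
    counted only if at least one of the two mask bits is set
    (assumed combination rule).
    """
    count = 0
    for r, (cur_row, ref_row) in enumerate(zip(cur_bits, ref_bits)):
        for c, (a, b) in enumerate(zip(cur_row, ref_row)):
            counted = 1
            if cur_mask is not None and ref_mask is not None:
                counted = cur_mask[r][c] | ref_mask[r][c]
            count += counted & (a ^ b)
    return count

# Hypothetical 2x4 one-bit blocks and constraint masks.
cur = [[1, 0, 1, 1],
       [0, 0, 1, 0]]
ref = [[1, 1, 1, 0],
       [0, 1, 1, 0]]
cur_mask = [[1, 1, 1, 0],
            [1, 0, 1, 1]]
ref_mask = [[1, 0, 1, 0],
            [1, 0, 1, 1]]

plain = nnmp(cur, ref)                            # every mismatch counts
constrained = nnmp(cur, ref, cur_mask, ref_mask)  # only masked-in mismatches count
```

In a block-matching search, the candidate displacement with the smallest count would be chosen as the motion vector; only bitwise operations and an accumulator are needed, which is why such criteria suit hardware.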
Multi-Layer Perceptron and Radial Basis Function Neural Network Models for Classification of Diabetic Retinopathy Disease Using Video-Oculography Signals
Diabetes Mellitus (Diabetes) is a disease based on insulin hormone disorders that causes high blood glucose. Clinical findings indicate that diabetes can be diagnosed by electrophysiological signals obtained from the vital organs. 'Diabetic Retinopathy' is one of the most common eye diseases resulting from diabetes, and it is the leading cause of vision loss due to structural alteration of the retinal layer vessels. In this study, features of horizontal and vertical Video-Oculography (VOG) signals have been used to classify non-proliferative and proliferative diabetic retinopathy disease. Twenty-five features are acquired by applying the discrete wavelet transform to VOG signals taken from 21 subjects. Two models, based on multi-layer perceptron (MLP) and radial basis function (RBF) networks, are recommended for the diagnosis of Diabetic Retinopathy. The proposed models can also detect the level of the disease. We show the comparative classification performance of the proposed models. Our results show that the proposed RBF model (100%) results in better classification performance than the MLP model (94%).
Human Behavior Modeling in Video Surveillance of Conference Halls
In this paper, we present a human behavior modeling approach for video scenes. This approach is used to model normal behaviors in conference halls. We exploit the Probabilistic Latent Semantic Analysis (PLSA) technique, using the 'Bag-of-Terms' paradigm, as a tool for exploring video data to learn the model by grouping similar activities. Our term vocabulary consists of 3D spatio-temporal patch groups assigned by the direction of motion. Our video representation preserves the spatial information, the object trajectory, and the motion. The main importance of this approach is that it can be adapted to detect abnormal behaviors in order to ensure and enhance human security.
Lecture Video Indexing and Retrieval Using Topic Keywords
In this paper, we propose a framework to help users search for and retrieve the portions of a lecture video that interest them. This is achieved by temporally segmenting and indexing the lecture video using topic keywords. For this purpose, we use transcribed text from the video and documents relevant to the video topic extracted from the web. The keywords for indexing are found by applying non-negative matrix factorization (NMF) topic modeling techniques to the web documents. Our proposed technique first creates indices on the transcribed documents using the topic keywords, and these are mapped to the video to find the start and end times of the portions of the video for a particular topic. This time information is stored in the index table along with the topic keyword, which is used to retrieve the specific portions of the video for the query provided by the users.
Smartphone Video Source Identification Based on Sensor Pattern Noise
An increasing number of mobile devices with integrated
cameras has meant that most digital video comes from these devices.
These digital videos can be made anytime, anywhere and for different
purposes. They can also be shared on the Internet in a short period
of time and may sometimes contain recordings of illegal acts. The
need to reliably trace the origin becomes evident when these videos
are used for forensic purposes. This work proposes an algorithm
to identify the brand and model of the mobile device that generated
the video. Its procedure is as follows: after obtaining the relevant
video information, a classification algorithm based on sensor noise
and Wavelet Transform performs the aforementioned identification
process. We also present experimental results that support the validity
of the techniques used and show promising results.
Motion Estimator Architecture with Optimized Number of Processing Elements for High Efficiency Video Coding
Motion estimation accounts for the heaviest computation in HEVC (high efficiency video coding). Many fast algorithms such as TZS (test zone search) have been proposed to reduce the computation. Still, the huge computation of motion estimation is a critical issue in the implementation of an HEVC video codec. In this paper, a motion estimator architecture with an optimized number of PEs (processing elements) is presented by exploiting early termination. It also reduces hardware size by exploiting parallel processing. The presented motion estimator architecture has 8 PEs, and it can efficiently perform TZS with very high utilization of PEs.
H.264 Video Privacy Protection Method Using Regions of Interest Encryption
Video surveillance systems such as closed-circuit television (CCTV) are widely deployed to gather video of unspecified people for crime prevention, surveillance, and many other purposes. However, abuse of CCTV raises concerns about invasion of personal privacy. In this paper, we propose an encryption method that protects personal privacy in an H.264 compressed video bitstream by encrypting only regions of interest (ROI). There is no need to change the existing video surveillance system. However, encrypting ROI in a compressed video bitstream is challenging due to spatial and temporal drift errors. For this reason, we propose a novel drift mitigation method for use when ROI is encrypted. The proposed method was implemented using the JM reference software on H.264 compressed videos, and experimental results verify the proposed method and its effectiveness.
Design and Implementation of a Counting and Differentiation System for Vehicles through Video Processing
This paper presents a self-sustaining mobile system for
counting and classification of vehicles through video processing. It
proposes a counting and classification algorithm divided into four steps
that can be executed multiple times in parallel in a SBC (Single
Board Computer), like the Raspberry Pi 2, in such a way that it
can be implemented in real time. The first step of the proposed
algorithm limits the zone of the image that will be processed.
The second step performs the detection of the mobile objects using
a BGS (Background Subtraction) algorithm based on the GMM
(Gaussian Mixture Model), as well as a shadow removal algorithm
using physical-based features, followed by morphological operations.
In the third step, vehicle detection is performed using
edge detection algorithms and vehicle tracking through Kalman
filters. The last step of the proposed algorithm registers the vehicle
passing and performs their classification according to their areas.
An auto-sustainable system is proposed, powered by batteries and
photovoltaic solar panels, and the data transmission is done through
GPRS (General Packet Radio Service), eliminating the need for
external cabling, which will facilitate its deployment and relocation to
any location where it could operate. The self-sustaining trailer will
allow the counting and classification of vehicles in specific zones
with difficult access.
The Video Database for Teaching and Learning in Football Refereeing
The following paper describes the video database tool used by the Fédération Internationale de Football Association (FIFA) as part of the research project developed in collaboration with the Carlos III University of Madrid. The database project began in 2012, with the aim of creating an educational tool for the training of instructors, referees and assistant referees, and it has been used in all FUTURO III courses since 2013. The platform now contains 3,135 video clips of different match situations from FIFA competitions. It has 1,835 users (FIFA instructors, referees and assistant referees). In this work, the main features of the database are described, such as the use of a search tool and the creation of multimedia presentations and video quizzes. The database has been developed in MySQL, ActionScript, Ruby on Rails and HTML. This tool has been rated by users as "very good" in all courses, which prompts us to introduce it as an ideal tool for any other sport that requires the use of video analysis.
Video Based Ambient Smoke Detection By Detecting Directional Contrast Decrease
Fire-related incidents account for extensive loss of life and
material damage. Quick and reliable detection of occurring fires has high
real world implications. Whereas a major research focus lies on the detection
of outdoor fires, indoor camera-based fire detection is still an open issue.
Cameras in combination with computer vision help to detect flames and
smoke more quickly than conventional fire detectors. In this work, we present
a computer vision-based smoke detection algorithm based on contrast changes
and a multi-step classification. This work accelerates computer vision-based
fire detection considerably in comparison with classical indoor-fire detection.
Real Time Video Based Smoke Detection Using Double Optical Flow Estimation
In this paper, we present a video based smoke detection
algorithm based on TVL1 optical flow estimation. The main part
of the algorithm is an accumulating system for motion angles and
upward motion speed of the flow field. We optimized the usage of
TVL1 flow estimation for the detection of smoke with very low smoke
density. Therefore, we use adapted flow parameters and estimate the
flow field on difference images. We show in theory and in evaluation
that this improves the performance of smoke detection significantly.
We evaluate the smoke algorithm using videos with different smoke
densities and different backgrounds. We show that smoke detection
is very reliable in varying scenarios. Further we verify that our
algorithm is very robust to disturbances in videos of crowded scenes.
Surveillance Video Summarization Based on Histogram Differencing and Sum Conditional Variance
This paper presents a surveillance video summarization method aimed at faster and more efficient video summarization. The method relies on temporal differencing to extract the most important data from a large video stream. It uses histogram differencing and Sum Conditional Variance, which is robust against illumination variations, in order to extract moving objects. The experimental results show that the presented method gives better output than temporal differencing based summarization techniques.
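As a rough illustration of histogram differencing between consecutive frames, the sketch below keeps a frame whenever its grey-level histogram differs from the previous frame's by more than a threshold. The bin count, the threshold, the function names and the tiny sample frames are all assumptions; the Sum Conditional Variance stage named in the abstract is not reproduced here.

```python
def histogram(frame, bins=4, max_value=256):
    """Grey-level histogram of a frame given as a list of rows of ints."""
    counts = [0] * bins
    for row in frame:
        for pixel in row:
            counts[pixel * bins // max_value] += 1
    return counts

def histogram_difference(frame_a, frame_b, bins=4):
    """Sum of absolute bin differences between the two frames' histograms."""
    ha, hb = histogram(frame_a, bins), histogram(frame_b, bins)
    return sum(abs(x - y) for x, y in zip(ha, hb))

def select_frames(frames, threshold):
    """Keep frame 0 and every frame whose histogram differs from the
    previous frame's by more than the threshold."""
    kept = [0]
    for i in range(1, len(frames)):
        if histogram_difference(frames[i - 1], frames[i]) > threshold:
            kept.append(i)
    return kept

# Hypothetical 2x2 greyscale frames: a bright object appears in frame 2.
frames = [
    [[10, 12], [11, 10]],
    [[10, 13], [12, 10]],
    [[10, 250], [240, 10]],
    [[10, 250], [240, 10]],
]
summary = select_frames(frames, threshold=2)
```

Small pixel fluctuations that stay within one bin leave the histogram unchanged, so only the frame where the object appears is added to the summary.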
The Role of Online Videos in Undergraduate Casual-Leisure Information Behaviors
This study describes undergraduate casual-leisure information behaviors relevant to online videos. Diaries and in-depth interviews were used to collect data. Twenty-four undergraduates participated in this study (9 men, 15 women; all were aged 18–22 years). This study presents a model of casual-leisure information behaviors and contributes new insights into user experience in casual-leisure settings, such as online video programs, with implications for other information domains.
Shot Boundary Detection Using Octagon Square Search Pattern
In this paper, a shot boundary detection method using an octagon square search pattern is presented. The color, edge, motion and texture features of each frame are extracted and used in shot boundary detection. The motion feature is extracted using the octagon square search pattern. Then, the transition detection method detects the shot or non-shot boundaries in the video using the feature weight values. Experimental results are evaluated on the TRECVID video test set, which contains various types of shot transition with lighting effects and object and camera movement within the shots. Further, this paper compares the experimental results of the proposed method with existing methods. The comparison shows that the proposed method outperforms the state-of-the-art methods for shot boundary detection.
A Video Watermarking Algorithm Based on Chaotic and Wavelet Neural Network
This paper presents a video watermarking algorithm based on a wavelet chaotic neural network. First, to enhance the security of the binary watermark image, the algorithm encrypts it with double chaos based on the Arnold and Logistic maps. Then, the host video is divided into equal frames, and the key frames are extracted using a chaotic sequence generated by the Logistic map. Meanwhile, we extract the low-frequency coefficients of the luminance component and self-adaptively embed the processed image watermark into the low-frequency coefficients of the wavelet-transformed luminance component with the wavelet neural network. The experimental results suggest that the presented algorithm has better invisibility and robustness against noise, Gaussian filtering, rotation, frame loss and other attacks.
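The two chaotic maps named above can be sketched in a few lines. The Arnold cat map permutes pixel positions of a square image, and the logistic map x(n+1) = mu * x(n) * (1 - x(n)) yields a key-dependent sequence. How the paper combines them is not stated in the abstract, so the pipeline below (scramble, then XOR with thresholded logistic iterates), the key values and all function names are assumptions for illustration only.

```python
def arnold_scramble(image):
    """Apply one iteration of the Arnold cat map to a square image."""
    n = len(image)
    out = [[0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            out[(x + y) % n][(x + 2 * y) % n] = image[x][y]
    return out

def arnold_unscramble(image):
    """Invert one iteration of the Arnold cat map."""
    n = len(image)
    out = [[0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            out[(2 * x - y) % n][(y - x) % n] = image[x][y]
    return out

def logistic_bits(x0, mu, count):
    """Binary key stream from logistic-map iterates thresholded at 0.5."""
    bits, x = [], x0
    for _ in range(count):
        x = mu * x * (1.0 - x)
        bits.append(1 if x >= 0.5 else 0)
    return bits

def xor_with_key(image, bits):
    """XOR every pixel of a square binary image with the key stream."""
    n = len(image)
    return [[image[r][c] ^ bits[r * n + c] for c in range(n)] for r in range(n)]

# Hypothetical 4x4 binary watermark and hypothetical key (x0, mu).
watermark = [[1, 0, 0, 1],
             [0, 1, 1, 0],
             [1, 1, 0, 0],
             [0, 0, 1, 1]]
key = logistic_bits(x0=0.37, mu=3.99, count=16)

encrypted = xor_with_key(arnold_scramble(watermark), key)
recovered = arnold_unscramble(xor_with_key(encrypted, key))
```

Both stages are exactly invertible given the key, so the watermark extracted from the video can be decrypted by the owner but looks like noise to anyone without the initial value and parameter.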
Violent Videogame Playing and Its Relations to Antisocial Behaviors
The presented study focuses on relations between violent videogame playing and various types of antisocial behavior, namely bullying (verbal, indirect, and physical), physical aggression and delinquency. Relevant relationships were also examined with respect to gender. Violent videogame exposure (VGV) was measured by respondents’ most favored games and self-evaluation of their level of violence and frequency of playing. Antisocial behaviors were assessed by self-report questionnaires. The research sample consisted of 333 (166 males, 167 females) primary and secondary school students aged between 10 and 19 years (m=14.98, sd=1.77). It was found that violent videogame playing is associated with physical aggression (rho=0.288, 95% CI [0.169;0.400]) and bullying (rho=0.369, 95% CI [0.254;0.476]). Broken down by gender, these relations were slightly weaker in males (VGV - physical aggression: rho=0.104, 95% CI [-0.061;0.264], VGV - bullying: rho=0.200, 95% CI [0.032;0.356]) than in females (VGV - physical aggression: rho=0.257, 95% CI [0.089;0.411], VGV - bullying: rho=0.279, 95% CI [0.110;0.432]).
Parenting Styles and Their Relation to Videogame Addiction
We try to identify the role of various aspects of parenting style in the phenomenon of videogame playing addiction. Relevant self-report questionnaires were part of a wider set of methods focused on constructs related to videogame playing. The battery of methods was administered in school settings in paper-and-pencil form. The research sample consisted of 333 (166 males, 167 females) elementary and high school students aged between 10 and 19 years (m=14.98, sd=1.77). Using stepwise regression analysis, we assessed the influence of demographic variables (gender and age) and parenting styles. Age and gender together explained 26.3% of game addiction variance (F(2,330)=58.81, p<.01). Adding four aspects of parenting style (inconsistency, involvement, control, and warmth) explained another 10.2% of variance (∆F(4,326)=13.09, p<.01). The significant predictors were gender of the respondent (B=0.70, p<.01), where males scored higher on the game addiction scale; age (β=-0.18, p<.01), where younger children showed a higher level of addiction; and parental inconsistency (β=0.30, p<.01), where the higher the inconsistency in upbringing, the more developed the game playing addiction.
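The reported F statistics can be checked against the reported variance shares with the standard formula for an R-squared change, F = (ΔR² / df_added) / ((1 − R²_full) / df_residual). The sketch below uses only the rounded percentages from the abstract, so small deviations from the published values are expected; the function and variable names are illustrative.

```python
def f_change(r2_reduced, r2_full, df_added, df_residual):
    """F statistic for the R^2 increase when predictors are added to a model."""
    return ((r2_full - r2_reduced) / df_added) / ((1.0 - r2_full) / df_residual)

r2_step1 = 0.263              # age and gender (reported)
r2_step2 = r2_step1 + 0.102   # plus four parenting-style aspects (reported increment)

# Step 1 is tested against an empty model (R^2 = 0).
f_step1 = f_change(0.0, r2_step1, df_added=2, df_residual=330)
# Step 2 is tested against the step-1 model.
f_step2 = f_change(r2_step1, r2_step2, df_added=4, df_residual=326)

total_explained = r2_step2    # share of variance explained by the full model
```

With these inputs the recomputed values come out at roughly 58.9 and 13.1, in line with the reported F(2,330)=58.81 and ∆F(4,326)=13.09, and the full model explains about 36.5% of the variance.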
A Scalable Media Job Framework for an Open Source Search Engine
This paper explores efficient ways to implement various
media-updating features like news aggregation, video conversion,
and bulk email handling. All of these jobs share the property
that they are periodic in nature, and they all benefit from being
handled in a distributed fashion. The data for these jobs also often
comes from a social or collaborative source. We isolate the class of
periodic, one round map reduce jobs as a useful setting to describe
and handle media updating tasks. As such tasks are simpler than
general map reduce jobs, programming them in a general map
reduce platform could easily become tedious. This paper presents
a MediaUpdater module of the Yioop Open Source Search Engine
Web Portal designed to handle such jobs via an extension of a
PHP class. We describe how to implement various media-updating
tasks in our system as well as experiments carried out using these
implementations on an Amazon Web Services cluster.
Anonymous Editing Prevention Technique Using Gradient Method for High-Quality Video
Since the advances in digital imaging technologies have led to
development of high quality digital devices, there are a lot of illegal copies
of copyrighted video content on the Internet. Also, unauthorized editing
occurs frequently. Thus, we propose an editing prevention technique for
high-quality (HQ) video that can prevent these illegally edited copies from
spreading out. The proposed technique applies spatial and temporal gradient
methods to improve the fidelity and detection performance. Also, the scheme
duplicates the embedding signal temporally to alleviate the signal reduction
caused by geometric and signal-processing distortions. Experimental results
show that the proposed scheme achieves better performance than previously
proposed schemes and it has high fidelity. The proposed scheme can be used
in unauthorized access prevention methods for visual communication or in traitor
tracking applications that need a fast detection process to prevent illegally
edited video content from spreading out.
Evaluation of Cognitive Benefits among Differently Abled Subjects with Video Game as Intervention
In this study, the potential benefits of playing action
video game among congenitally deaf and dumb subjects are reported in
terms of EEG ratio indices. The frontal and occipital lobes are
associated with development of motor skills, cognition, and visual
information processing and color recognition. The sixteen hours of
First-Person shooter action video game play resulted in the increase
of the ratios β/(α+θ) and β/θ in frontal and occipital lobes. This can
be attributed to the enhancement of certain aspects of cognition among
deaf and dumb subjects.
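The two EEG ratio indices named above are simple quotients of band powers. A minimal sketch follows, assuming the theta, alpha and beta band powers have already been estimated for one lobe; the numeric values are hypothetical and are not taken from the study.

```python
def eeg_ratio_indices(theta, alpha, beta):
    """Return the two ratio indices for one recording site."""
    return {
        "beta/(alpha+theta)": beta / (alpha + theta),
        "beta/theta": beta / theta,
    }

# Hypothetical band powers (arbitrary units) before and after the intervention.
before = eeg_ratio_indices(theta=8.0, alpha=12.0, beta=5.0)
after = eeg_ratio_indices(theta=7.0, alpha=11.0, beta=6.3)

# An increase in both indices is the pattern the abstract reports.
increased = all(after[name] > before[name] for name in before)
```

Both indices rise when beta power grows relative to the slower alpha and theta rhythms.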
A Low-Cost Vision-Based Unmanned Aerial System for Extremely Low-Light GPS-Denied Navigation and Thermal Imaging
This paper presents the design and implementation
details of a complete unmanned aerial system (UAS) based
on commercial-off-the-shelf (COTS) components, focusing on
safety, security, search and rescue scenarios in GPS-denied
environments. In particular, the aerial platform is capable
of semi-autonomously navigating through extremely low-light,
GPS-denied indoor environments based on onboard sensors only,
including a downward-facing optical flow camera. Besides, an
additional low-cost payload camera system is developed to stream
both infra-red video and visible light video to a ground station in
real-time, for the purpose of detecting signs of life and hidden humans.
The total cost of the complete system is estimated to be $1150,
and the effectiveness of the system has been tested and validated
in practical scenarios.
Efficient Utilization of Unmanned Aerial Vehicle (UAV) for Fishing through Surveillance for Fishermen
UAVs are small remotely operated or automated aerial
surveillance systems without a human pilot aboard. UAVs are generally
used in military and special operations applications; a recent
growing trend is their application in civil and non-military
work such as inspection of power lines or pipelines. The
objective of this paper is the augmentation of a UAV in order to
replace the existing expensive sonar (Sound Navigation And
Ranging) based equipment among small-scale fishermen, for whom
access to sonar equipment is restricted due to limited economic
resources. The surveillance equipment on the UAV will
relay data and GPS (Global Positioning System) location to a
receiver on the fishing boat using RF signals, from which the
location of schools of fish can be found. In addition to this, an
emergency beacon system is present for rescue operations and drone
A Four-Step Ortho-Rectification Procedure for Geo-Referencing Video Streams from a Low-Cost UAV
In this paper, we present a four-step ortho-rectification
procedure for real-time geo-referencing of video data from a low-cost
UAV equipped with a multi-sensor system. The basic procedures for
the real-time ortho-rectification are: (1) decomposition of the video
stream into individual frames; (2) establishing the interior camera
orientation parameters; (3) determining the relative orientation
parameters for each video frame with respect to each other; (4)
finding the absolute orientation parameters, using a self-calibration
bundle adjustment with the aid of a mathematical model. Each
ortho-rectified video frame is then mosaicked together to produce a
mosaic image of the test area, which is then merged with a well
referenced existing digital map for the purpose of geo-referencing
and aerial surveillance. A test field located in Abuja, Nigeria was
used to evaluate our method. Video and telemetry data were collected
for about fifteen minutes, and they were processed using the four-step
ortho-rectification procedure. The results demonstrated that
geometric measurements of the control field from ortho-images are
more accurate than those from the original perspective
images when used to pinpoint the exact location of targets on the
video imagery acquired by the UAV. The 2-D planimetric accuracy,
compared with the 6 control points measured by a GPS receiver,
is between 3 and 5 metres.
The Impact of Temporal Impairment on Quality of Experience (QoE) in Video Streaming: A No Reference (NR) Subjective and Objective Study
Live video streaming is one of the most widely used
services among end users, yet it is a big challenge for the network
operators in terms of quality. The only way to provide excellent
Quality of Experience (QoE) to the end users is continuous
monitoring of live video streaming. For this purpose, there are several
objective algorithms available that monitor the quality of the video in
a live stream. Subjective tests play a very important role in fine
tuning the results of objective algorithms. As human perception is
considered to be the most reliable source for assessing the quality of a
video stream, subjective tests are conducted in order to develop more
reliable objective algorithms. Temporal impairments in a live video
stream can have a negative impact on the end users. In this paper we
have conducted subjective evaluation tests on a set of video
sequences containing temporal impairment known as frame freezing.
Frame Freezing is considered as a transmission error as well as a
hardware error which can result in loss of video frames on the
reception side of a transmission system. In our subjective tests, we
have performed tests on videos that contain a single freezing event
and also for videos that contain multiple freezing events. We have
recorded our subjective test results for all the videos in order to give a
comparison on the available No Reference (NR) objective
algorithms. Finally, we have shown the performance of no reference
algorithms used for objective evaluation of videos and suggested the
algorithm that works better. The outcome of this study shows the
importance of QoE and its effect on human perception. The results
for the subjective evaluation can serve the purpose for validating
Video Sharing System Based on Wi-Fi Camera
This paper introduces a video sharing platform based
on Wi-Fi, which consists of a camera, a mobile phone and a PC server. This
platform can receive the wireless signal from the camera and show the live
video captured by the camera on the mobile phone. In addition, it is able to
send commands to the camera and control the camera’s holder to rotate.
The platform can be applied to interactive teaching, the monitoring of
dangerous areas and so on. Testing results show that the platform can
share the live video on the mobile phone. Furthermore, if the system’s PC
server, the camera and many mobile phones are connected
together, it can transfer photos concurrently.
Secure Low-Bandwidth Video Streaming through Reliable Multipath Propagation in MANETs
Most of the existing video streaming protocols
provide video services without considering security aspects in
decentralized mobile ad-hoc networks. The security policies adapted
to the currently existing non-streaming protocols do not comply with
the live video streaming protocols resulting in considerable
vulnerability, high bandwidth consumption and unreliability which
cause severe security threats, low bandwidth and error prone
transmission respectively in video streaming applications. Therefore
a synergized methodology is required to reduce vulnerability and
bandwidth consumption, and enhance reliability in the video
streaming applications in MANET. To ensure the security measures
with reduced bandwidth consumption and improve reliability of the
video streaming applications, a Secure Low-bandwidth Video
Streaming through Reliable Multipath Propagation (SLVRMP)
protocol architecture has been proposed by incorporating the two
algorithms namely Secure Low-bandwidth Video Streaming
Algorithm and Reliable Secure Multipath Propagation Algorithm
using Layered Video Coding in non-overlapping zone routing
network topology. The performances of the proposed system are
compared to those of the other existing secure multipath protocols
Sec-MR, SPREAD using NS 2.34 and the simulation results show
that the performances of the proposed system get considerably
Automatic Motion Trajectory Analysis for Dual Human Interaction Using Video Sequences
Advances in image and video processing techniques have enabled the development of intelligent video surveillance systems. This study aimed to automatically detect moving human objects and to analyze events of dual human interaction in a surveillance scene. Our system was developed in four major steps: image preprocessing, human object detection, human object tracking, and motion trajectory analysis. Adaptive background subtraction and image processing techniques were used to detect and track moving human objects. To solve the occlusion problem during the interaction, the Kalman filter was used to retain a complete trajectory for each human object. Finally, the motion trajectory analysis was developed to distinguish between interaction and non-interaction events based on derivatives of trajectories related to the speed of the moving objects. Using a database of 60 video sequences, our system achieved classification accuracies of 80% for interaction events and 95% for non-interaction events. In summary, we have investigated a system for the automatic classification of interaction and non-interaction events using surveillance cameras. Ultimately, this system could be incorporated in an intelligent surveillance system for the detection and/or classification of abnormal or criminal events (e.g., theft, snatching, fighting, etc.).
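One way to read "derivatives of trajectories related to the speed" is as finite differences of tracked positions. The sketch below computes per-step speed and the mean absolute change in speed for a trajectory of at least three points. The feature choice, the function names and the sample trajectories are assumptions; the abstract does not specify the actual features or decision rule.

```python
import math

def speeds(trajectory, dt=1.0):
    """Speed between consecutive (x, y) positions sampled every dt seconds."""
    return [math.hypot(x1 - x0, y1 - y0) / dt
            for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:])]

def mean_abs_speed_change(trajectory, dt=1.0):
    """Mean absolute difference between consecutive speeds (needs >= 3 points)."""
    s = speeds(trajectory, dt)
    return sum(abs(b - a) for a, b in zip(s, s[1:])) / (len(s) - 1)

# Hypothetical trajectories in pixel coordinates.
steady_walk = [(0, 0), (3, 4), (6, 8), (9, 12)]    # constant speed
stop_and_go = [(0, 0), (3, 4), (3, 4), (6, 8)]     # pauses mid-way

steady_score = mean_abs_speed_change(steady_walk)
pause_score = mean_abs_speed_change(stop_and_go)
```

A person who keeps walking produces a score near zero, while someone who stops and restarts, as might happen during an interaction, produces a large one; a threshold on such a score would be one simple hypothetical decision rule.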
Toward Indoor and Outdoor Surveillance Using an Improved Fast Background Subtraction Algorithm
The detection of moving objects from video image
sequences is very important for object tracking, activity recognition,
and behavior understanding in video surveillance.
The most widely used approach for moving object detection and tracking is
background subtraction. Many approaches have been
suggested for background subtraction. However, these are sensitive to
illumination changes, and the solutions proposed to bypass this problem
are time consuming.
In this paper, we propose a robust yet computationally efficient
background subtraction approach and, mainly, focus on the ability to
detect moving objects on dynamic scenes, for possible applications in
complex and restricted access areas monitoring, where moving and
motionless persons must be reliably detected. It consists of three
main phases: establishing illumination change invariance,
background/foreground modeling and morphological analysis for
We handle illumination changes using Contrast Limited Adaptive Histogram
Equalization (CLAHE), which limits the intensity of each pixel to
a user-determined maximum. Thus, it mitigates the degradation due to
scene illumination changes and improves the visibility of the video
signal. Initially, the background and foreground images are extracted
from the video sequence. Then, the background and foreground
images are separately enhanced by applying CLAHE.
In order to form multi-modal backgrounds, we model each channel
of a pixel as a mixture of K Gaussians (K=5) using a Gaussian Mixture
Model (GMM). Finally, we post-process the resulting binary
foreground mask using morphological erosion and dilation
transformations to remove possible noise.
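The erosion-then-dilation post-processing described above (a morphological opening) can be sketched in plain Python. The 3x3 square structuring element, the treatment of pixels outside the image as background, and the sample mask are assumptions; the abstract does not state the element size or the number of iterations.

```python
def _window(mask, r, c):
    """Values in the 3x3 window centred on (r, c); outside the image counts as 0."""
    rows, cols = len(mask), len(mask[0])
    return [mask[i][j] if 0 <= i < rows and 0 <= j < cols else 0
            for i in range(r - 1, r + 2)
            for j in range(c - 1, c + 2)]

def erode(mask):
    """A pixel survives only if its whole 3x3 window is foreground."""
    return [[int(all(_window(mask, r, c))) for c in range(len(mask[0]))]
            for r in range(len(mask))]

def dilate(mask):
    """A pixel becomes foreground if any pixel in its 3x3 window is foreground."""
    return [[int(any(_window(mask, r, c))) for c in range(len(mask[0]))]
            for r in range(len(mask))]

def opening(mask):
    """Erosion followed by dilation: removes specks smaller than the element."""
    return dilate(erode(mask))

# Hypothetical foreground mask: a 3x3 object plus one isolated noise pixel.
noisy = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
cleaned = opening(noisy)
```

Erosion deletes the isolated pixel and shrinks the object to its centre; the following dilation grows the object back to its original 3x3 extent, so only the noise is lost.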
For the experimental tests, we used a standard dataset to challenge the
efficiency and accuracy of the proposed method on a diverse set of