An important problem in modern therapeutics at the proteomic level remains the identification of therapeutic targets from the plenitude of high-throughput data generated by experiments relevant to a variety of diseases. This paper presents the application of modern control concepts, such as pinning controllability and observability, to the glioma cancer stem cell (GSC) protein graph network, with known and novel associations to glioblastoma (GBM). The theoretical framework provides the minimal number of "driver nodes" that are necessary, and their location, to attain full control over the obtained graph network, in order to steer the network's dynamics from an initial state (disease) to a desired state (non-disease). The achieved results will provide biochemists with techniques to identify further metabolic regions and biological pathways for complex diseases, and to design and test novel therapeutic solutions.
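As an illustration of the driver-node concept invoked above, the sketch below estimates a minimum driver-node set for a toy directed protein network via the maximum-matching criterion of structural controllability. The edge list, protein names, and the use of networkx are illustrative assumptions, not the paper's actual GSC network or pipeline.

```python
# A minimal sketch (not the paper's exact pipeline) of estimating the minimum
# set of driver nodes for a directed network via the maximum-matching criterion
# of structural controllability. The toy edge list below is purely illustrative.
import networkx as nx
from networkx.algorithms import bipartite

edges = [("EGFR", "AKT1"), ("AKT1", "MTOR"), ("MTOR", "RPS6"),
         ("EGFR", "STAT3"), ("STAT3", "MYC")]
G = nx.DiGraph(edges)

# Bipartite representation: each node appears once as a source ("out" copy)
# and once as a target ("in" copy); every directed edge links the two sides.
B = nx.Graph()
B.add_nodes_from((("out", u) for u in G), bipartite=0)
B.add_nodes_from((("in", v) for v in G), bipartite=1)
B.add_edges_from((("out", u), ("in", v)) for u, v in G.edges())

matching = bipartite.maximum_matching(B, top_nodes=[("out", u) for u in G])

# A node is "matched" if its "in" copy is an endpoint of a matching edge;
# unmatched nodes must be driven directly by an external control signal.
matched = {v for (side, v) in matching if side == "in"}
drivers = set(G) - matched or {next(iter(G))}  # at least one driver is needed
print(len(drivers), sorted(drivers))
```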
Graph network models have become an important computational technique in neuroscience for studying the fundamental organizational principles of brain structure and function in neurodegenerative diseases such as dementia. The graph connectivity is reflected in the connectome, the complete set of structural and functional connections of the graph network, which is mostly based on simple Pearson correlation links.
In contrast to simple Pearson correlation networks, partial correlations (PC) identify only direct correlations, while indirect associations are eliminated. In addition, the state-of-the-art techniques in brain research are based on static graph theory, which is unable to capture the dynamic behavior of brain connectivity as it alters with disease evolution. We propose a new research avenue in neuroimaging connectomics based on combining dynamic graph network theory and modeling strategies at different time scales. We present the theoretical framework for area aggregation and time-scale modeling in brain networks as they pertain to disease evolution in dementia. This novel paradigm is extremely powerful, since we can derive both static parameters, pertaining to node and area properties, and dynamic parameters, such as the system's eigenvalues. By implementing and dynamically analyzing both disease-driven PC networks and regular concentration networks, we reveal differences in the structure of these networks that play an important role in the temporal evolution of the disease. The described research is key to advancing biomedical research on novel disease prediction trajectories and dementia therapies.
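To make the distinction between Pearson and partial-correlation links concrete, the minimal sketch below (synthetic data, standard Gaussian graphical-model assumption; not the paper's connectome pipeline) computes partial correlations from the inverse covariance matrix and shows how an indirect association present in the Pearson matrix vanishes.

```python
# A minimal sketch of partial correlation from the precision (inverse covariance)
# matrix: direct links survive, associations mediated by a third region do not.
# The data matrix X (time points x regions) is synthetic.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))          # 200 samples, 5 regions
X[:, 1] += 0.8 * X[:, 0]                   # region 1 driven by region 0
X[:, 2] += 0.8 * X[:, 1]                   # region 2 driven by region 1 only

pearson = np.corrcoef(X, rowvar=False)

precision = np.linalg.inv(np.cov(X, rowvar=False))
d = np.sqrt(np.diag(precision))
partial = -precision / np.outer(d, d)
np.fill_diagonal(partial, 1.0)

# Pearson shows a sizeable 0-2 link (indirect, via region 1);
# the partial correlation between regions 0 and 2 is close to zero.
print(round(pearson[0, 2], 2), round(partial[0, 2], 2))
```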
KEYWORDS: Image filtering, Video compression, Copper, Digital filtering, Computer programming, Video coding, Video, Detection and tracking algorithms, Image classification, Image processing, Video processing, Internet technology, Internet
Nowadays, HEVC is the cutting-edge encoding standard and the most efficient solution for the transmission of video content. In this paper, a subjective quality improvement based on pre-processing algorithms for the detection of homogeneous and chaotic regions is proposed and evaluated for low-bit-rate applications at high resolutions. This goal is achieved by means of a texture classification applied to the input frames. Furthermore, these calculations also help reduce the complexity of the HEVC encoder, so both the subjective quality and the HEVC performance are improved.
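The abstract does not specify the texture classifier; purely as an illustration of the kind of homogeneous-versus-chaotic decision described, the sketch below labels fixed-size blocks of a frame by thresholding their local variance. The block size and threshold are assumptions, not the paper's method.

```python
# Illustrative stand-in only: classify each block x block tile of a frame as
# "homogeneous" (low variance) or "chaotic" (high variance).
import numpy as np

def classify_blocks(frame, block=16, var_threshold=25.0):
    """Return a boolean map (True = homogeneous) for each block x block tile."""
    h, w = frame.shape
    tiles = frame[: h - h % block, : w - w % block].reshape(
        h // block, block, w // block, block)
    variances = tiles.var(axis=(1, 3))
    return variances < var_threshold

frame = np.random.default_rng(5).integers(0, 256, (64, 64)).astype(float)
frame[:32, :32] = 128.0                     # flat (homogeneous) quadrant
print(classify_blocks(frame).astype(int))   # 1s appear in the top-left quadrant
```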
KEYWORDS: Video, Video compression, Video coding, Quantization, Copper, Visualization, Video processing, Detection and tracking algorithms, Information fusion, Computer programming, Data modeling, Data fusion
To address the conflicting objectives of a multi-parameter H.265/HEVC encoder system, the present paper analyzes a set of optimizations intended to improve the trade-off between quality, performance and power consumption for applications with different reliability and accuracy requirements. The method is based on Pareto optimization and has been tested at different resolutions on real-time encoders.
In the last decade, hyperspectral unmixing (HSU) analysis has been applied in many remote sensing applications. For this process, the linear mixture model (LMM) has been the most popular tool used to find pure spectral constituents, or endmembers, and their fractional abundance in each pixel of the data set. The unmixing process consists of three stages: (i) estimation of the number of pure spectral signatures or endmembers, (ii) automatic identification of the estimated endmembers, and (iii) estimation of the fractional abundance of each endmember in each pixel of the scene. However, unmixing algorithms can be computationally very expensive, a fact that compromises their use in applications under real-time constraints. This is mainly due to the last two stages of the unmixing process, which are the most time-consuming ones. In this work, we propose parallel OpenCL-library-based implementations of the sum-to-one constrained least squares unmixing (P-SCLSU) algorithm to estimate the per-pixel fractional abundances, using mathematical libraries such as clMAGMA or ViennaCL. To the best of our knowledge, this kind of analysis using OpenCL libraries has not been previously conducted in the hyperspectral imaging processing literature, and in our opinion it is very important in order to achieve efficient implementations using parallel routines. The efficacy of our proposed implementations is demonstrated through Monte Carlo simulations with real data experiments on high-performance computing (HPC) platforms such as commodity graphics processing units (GPUs).
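For reference, the sum-to-one constrained least squares abundance estimate has a simple closed form, sketched below in serial NumPy (not the paper's OpenCL/clMAGMA/ViennaCL implementation); the synthetic endmember matrix and abundances only check the formula.

```python
# A minimal sketch of SCLSU abundance estimation, solved per pixel:
# minimize ||y - E a||^2 subject to sum(a) = 1, via a Lagrange multiplier.
import numpy as np

def sclsu(Y, E):
    """Y: (bands, pixels) hyperspectral data; E: (bands, endmembers) signatures.
    Returns (endmembers, pixels) abundances whose columns sum to one."""
    G = np.linalg.inv(E.T @ E)           # inverse Gram matrix of the endmembers
    A_ls = G @ E.T @ Y                   # unconstrained least-squares abundances
    ones = np.ones((E.shape[1], 1))
    correction = G @ ones @ (1.0 - ones.T @ A_ls) / (ones.T @ G @ ones)
    return A_ls + correction

# Tiny synthetic check: 3 endmembers, 50 bands, 4 mixed pixels.
rng = np.random.default_rng(1)
E = rng.random((50, 3))
A_true = np.array([[0.6, 0.2, 0.1, 1.0],
                   [0.3, 0.5, 0.4, 0.0],
                   [0.1, 0.3, 0.5, 0.0]])
Y = E @ A_true
print(np.allclose(sclsu(Y, E), A_true, atol=1e-6))  # abundances recovered
```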
HEVC/H.265 is the most interesting and cutting-edge topic in the world of digital video compression, halving the required bandwidth in comparison with the previous H.264 standard. Telemedicine services, and in general any medical video application, can benefit from these video encoding advances. However, HEVC is computationally expensive to implement. In this paper, a method for reducing HEVC complexity in the medical environment is proposed. The sequences typically processed in this context contain several homogeneous regions. By leveraging these regions, it is possible to simplify the HEVC flow while maintaining high quality. In comparison with the HM16.2 reference software, the encoding time is reduced by up to 75% with negligible quality loss. Moreover, the algorithm is straightforward to implement on any hardware platform.
Image processing can be considered signal processing in two dimensions (2D). Filtering is one of the basic image processing operations. Filtering in the frequency domain is computationally faster than the corresponding spatial-domain operation, since the costly convolution becomes a multiplication in the frequency domain. The popular 2D transforms used in image processing are the Fast Fourier Transform (FFT), the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT). Common image resolutions are 640x480, 800x600, 1024x768 and 1280x1024. As can be seen, image dimensions are generally not powers of 2, so power-of-2 FFT lengths do not apply directly; instead, transforms of these sizes must be built from shorter Discrete Fourier Transform (DFT) blocks. Prime-factor FFT algorithms such as the Good-Thomas FFT algorithm simplify the implementation logic required for such applications and can therefore be implemented with low area and power consumption while meeting timing constraints and operating at high frequency. The Good-Thomas FFT algorithm, a Prime Factor FFT Algorithm (PFA), provides a means of computing the DFT with a minimal number of multiplication and addition operations. We provide an Altera FPGA based NIOS II custom-instruction implementation of the Good-Thomas FFT algorithm to improve system performance, and compare it with a purely software implementation of the same algorithm.
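The index mapping that makes the Good-Thomas PFA twiddle-factor free can be summarized in a few lines. The NumPy sketch below only verifies the mapping against a direct FFT; it says nothing about the FPGA/NIOS II custom-instruction implementation itself.

```python
# A minimal sketch of the Good-Thomas prime-factor mapping: a length-N DFT with
# N = N1*N2 and gcd(N1, N2) = 1 is re-indexed into an N1 x N2 two-dimensional DFT
# with no twiddle-factor multiplications between the stages.
import numpy as np

def good_thomas_dft(x, N1, N2):
    N = N1 * N2
    assert len(x) == N and np.gcd(N1, N2) == 1
    # Input re-indexing (Good's map): A[n1, n2] = x[(N2*n1 + N1*n2) mod N]
    n1, n2 = np.meshgrid(np.arange(N1), np.arange(N2), indexing="ij")
    A = x[(N2 * n1 + N1 * n2) % N]
    # Row/column DFTs of the small sizes N1 and N2 (done here with fft2).
    A_hat = np.fft.fft2(A)
    # Output re-indexing (CRT map): X[k] = A_hat[k mod N1, k mod N2]
    k = np.arange(N)
    return A_hat[k % N1, k % N2]

x = np.random.default_rng(2).standard_normal(15)              # 15 = 3 * 5, coprime
print(np.allclose(good_thomas_dft(x, 3, 5), np.fft.fft(x)))   # True
```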
KEYWORDS: Video coding, Multimedia, Video, Image compression, Image storage, Video compression, Internet technology, Internet, Image processing, Video processing, Copper, Computer programming, Quantization, Detection and tracking algorithms, Plutonium, Algorithm development
The HEVC/H.265 standard was released in 2013. It halves the required bandwidth in comparison with the previous H.264 standard, which opens the door to many relevant applications in the multimedia video coding and transmission context. Thanks to the HEVC improvements, the real-time constraints of 4K and 8K Ultra High Definition video can be met. Nonetheless, HEVC implementations require a vast amount of resources. In this contribution we propose intra- and inter-prediction techniques to diminish the HEVC complexity while complying with the real-time and quality constraints. The performance is noticeably increased with respect to both the HM16.2 reference software and the x265 encoder, while maintaining similar quality.
In recent years, hyperspectral analysis has been applied in many remote sensing applications. In fact, hyperspectral unmixing has been a challenging task in hyperspectral data exploitation. This process consists of three stages: (i) estimation of the number of pure spectral signatures or endmembers, (ii) automatic identification of the estimated endmembers, and (iii) estimation of the fractional abundance of each endmember in each pixel of the scene. However, unmixing algorithms can be computationally very expensive, a fact that compromises their use in applications under real-time constraints. In recent years, several techniques have been proposed to solve this problem, but until now most works have focused on the second and third stages. The execution cost of the first stage is usually lower than that of the other stages; indeed, it can be skipped if this estimate is known a priori. However, its acceleration on parallel architectures is still an interesting and open problem. In this paper we address this issue, focusing on the GENE algorithm, a promising geometry-based proposal introduced in [1]. We have evaluated our parallel implementation in terms of both accuracy and computational performance through Monte Carlo simulations on real and synthetic data experiments. Performance results on a modern GPU show satisfactory 16x speedup factors, which allow us to expect that this method could meet real-time requirements in a fully operational unmixing chain.
Recent advances in heterogeneous high performance computing (HPC) have opened new avenues for demanding remote sensing applications. Perhaps one of the most popular algorithms in target detection and identification is the automatic target detection and classification algorithm (ATDCA), widely used in the hyperspectral image analysis community. Previous research has already investigated the mapping of ATDCA on graphics processing units (GPUs) and field programmable gate arrays (FPGAs), showing impressive speedup factors that allow its exploitation in time-critical scenarios. Based on these studies, our work explores the performance portability of a tuned OpenCL implementation across a range of processing devices, including multicore processors, GPUs and other accelerators. This approach differs from previous papers, which focused on achieving the optimal performance on each platform. Here, we are more interested in the following issues: (1) evaluating whether a single code written in OpenCL allows us to achieve acceptable performance across all of them, and (2) assessing the gap between our portable OpenCL code and the hand-tuned versions previously investigated. Our study includes the analysis of different tuning techniques that expose data parallelism and enable an efficient exploitation of the complex memory hierarchies found in these new heterogeneous devices. Experiments have been conducted using hyperspectral data sets collected by NASA's Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) and the Hyperspectral Digital Imagery Collection Experiment (HYDICE) sensors. To the best of our knowledge, this kind of analysis has not been previously conducted in the hyperspectral imaging processing literature, and in our opinion it is very important in order to really calibrate the possibility of using heterogeneous platforms for efficient hyperspectral imaging processing in real remote sensing missions.
In the last decade, the issue of endmember variability has received considerable attention, particularly when each pixel is modeled as a linear combination of endmembers or pure materials. As a result, several models and algorithms have been developed to account for the effect of endmember variability in spectral unmixing, possibly including multiple endmembers in the spectral unmixing stage. One of the most popular approaches for this purpose is the multiple endmember spectral mixture analysis (MESMA) algorithm. The procedure executed by MESMA can be summarized as follows: (i) first, a standard linear spectral unmixing (LSU) or fully constrained linear spectral unmixing (FCLSU) algorithm is run in an iterative fashion; (ii) then, different endmember combinations, randomly selected from a spectral library, are used to decompose each mixed pixel; (iii) finally, the model with the best fit, i.e., with the lowest root mean square error (RMSE) in the reconstruction of the original pixel, is adopted. However, this procedure can be computationally very expensive, as several endmember combinations need to be tested and several abundance estimation steps need to be conducted, a fact that compromises the use of MESMA in applications under real-time constraints. In this paper we develop (for the first time in the literature) an efficient implementation of MESMA on different platforms using OpenCL, an open standard for parallel programming on heterogeneous systems. Our experiments have been conducted using a simulated data set and the clMAGMA mathematical library. This kind of implementation, using the same descriptive language on different architectures, is very important in order to actually calibrate the possibility of using heterogeneous platforms for efficient hyperspectral imaging processing in real remote sensing missions.
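The per-pixel selection rule in step (iii) can be made concrete with a short serial sketch. For simplicity it enumerates all two-endmember combinations rather than the random selection mentioned above and uses unconstrained least squares in place of LSU/FCLSU; it is illustrative only and does not reflect the paper's OpenCL/clMAGMA implementation.

```python
# A minimal sketch of the MESMA selection rule: every candidate endmember
# combination from a spectral library unmixes the pixel, and the combination
# with the lowest reconstruction RMSE is kept.
import itertools
import numpy as np

def unconstrained_abundances(y, E):
    """Least-squares abundances for pixel y (bands,) with endmembers E (bands, p)."""
    return np.linalg.lstsq(E, y, rcond=None)[0]

def mesma_pixel(y, library, n_endmembers=2):
    best = None
    for combo in itertools.combinations(range(library.shape[1]), n_endmembers):
        E = library[:, list(combo)]
        a = unconstrained_abundances(y, E)
        rmse = np.sqrt(np.mean((y - E @ a) ** 2))   # reconstruction error
        if best is None or rmse < best[0]:
            best = (rmse, combo, a)
    return best                                      # (rmse, endmember indices, abundances)

rng = np.random.default_rng(3)
library = rng.random((30, 6))                         # 6 library signatures, 30 bands
y = 0.7 * library[:, 1] + 0.3 * library[:, 4]         # pixel mixing signatures 1 and 4
print(mesma_pixel(y, library)[1])                     # -> (1, 4)
```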
The conditions that arise in the cocktail party problem prevail across many fields, creating a need for blind source separation (BSS). These fields include array processing, communications, medical signal processing, speech processing, wireless communication, audio and acoustics, and biomedical engineering. The concept of the cocktail party problem and BSS led to the development of Independent Component Analysis (ICA) algorithms. ICA proves useful for applications requiring real-time signal processing. The goal of this research was to perform an extensive study of the ability and efficiency of ICA algorithms to perform blind source separation on mixed signals in software, and of their implementation in hardware on a Field Programmable Gate Array (FPGA). The Algebraic ICA (A-ICA), FastICA, and Equivariant Adaptive Separation via Independence (EASI) algorithms were examined and compared. The best algorithm was the one that required the least complexity and fewest resources while effectively separating the mixed sources: the EASI algorithm. The EASI ICA was therefore implemented in hardware on an FPGA to analyze its performance in real time.
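For context, the EASI adaptation rule is compact enough to show in a few lines. The sketch below is a software reference only (not the paper's FPGA design), with a fixed synthetic mixing matrix and a cubic nonlinearity chosen as an assumption.

```python
# A minimal sketch of the EASI update rule,
#   W <- W - mu * (y y^T - I + g(y) y^T - y g(y)^T) W,  with y = W x,
# using g(y) = y^3. After adaptation, W @ A should approach a scaled
# permutation matrix (one dominant entry per row).
import numpy as np

rng = np.random.default_rng(4)
n, T = 2, 20000
S = np.vstack([np.sign(rng.standard_normal(T)),          # two independent sources
               rng.uniform(-1.0, 1.0, T)])
A = np.array([[0.8, 0.3],                                 # synthetic mixing matrix
              [-0.4, 0.9]])
X = A @ S                                                 # observed mixtures

W = np.eye(n)
mu = 1e-3
for x in X.T:                                             # one sample per update
    y = W @ x
    g = y ** 3
    W -= mu * (np.outer(y, y) - np.eye(n)
               + np.outer(g, y) - np.outer(y, g)) @ W

print(np.round(W @ A, 2))
```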
KEYWORDS: Information operations, Field programmable gate arrays, Motion estimation, Logic, Profiling, Embedded systems, Video coding, System on a chip, Optimization (mathematics), Clocks
This study focuses on accelerating the optimization of motion estimation algorithms, which are widely used in video coding standards, by using both the paradigm based on Altera Custom Instructions and the efficient combination of SDRAM and On-Chip memory of the Nios II processor. First, a complete code profiling is carried out before the optimization in order to detect the time leaks affecting the motion compensation algorithms. Then, a multi-cycle Custom Instruction to be added to the specific embedded design is implemented. The approach deployed is based on optimizing SOC performance through an efficient combination of On-Chip memory and SDRAM with regard to the reset vector, exception vector, stack, heap, read/write data (.rwdata), read-only data (.rodata), and program text (.text) in the design. Furthermore, the approach aims to enhance the said algorithms by incorporating Custom Instructions in the Nios II ISA. Finally, the efficient combination of both methods is developed to build the final embedded system. The present contribution thus facilitates motion coding for low-cost soft-core microprocessors, particularly the RISC architecture of the Nios II implemented in an FPGA, and enables us to construct an SOC that processes 50×50 video at 180 fps.
Nowadays, vision systems are used for countless purposes. Motion estimation, in particular, is a discipline that allows relevant information to be extracted, such as pattern segmentation, 3D structure, or object tracking. However, the real-time requirements of most applications have limited its consolidation, forcing the adoption of high-performance systems to meet response times. With the emergence of the highly parallel devices known as accelerators, this gap has narrowed. Two extreme endpoints in the spectrum of the most common accelerators are Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs), which usually offer higher performance rates than general-purpose processors. However, the use of GPUs as accelerators requires the efficient exploitation of any parallelism in the target application, which is not an easy task because performance is affected by many aspects that programmers must overcome. In this paper, we evaluate the OpenACC standard, a directive-based programming model that favors porting code to a GPU, in the context of a motion estimation application. The results confirm that this programming paradigm is suitable for such image processing applications, achieving very satisfactory acceleration in convolution-based problems such as the well-known Lucas & Kanade method.
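As a reminder of the Lucas & Kanade estimate referenced above, the sketch below solves the per-window least-squares flow in plain NumPy. It is an algorithmic reference only; the paper's contribution is the OpenACC/GPU port, which is not shown here, and the synthetic frames are assumptions.

```python
# A minimal sketch of Lucas & Kanade: within a small window, the flow (u, v)
# solves the least-squares system  Ix*u + Iy*v = -It  built from image gradients.
import numpy as np

def lucas_kanade(frame0, frame1, y, x, radius=5):
    """Flow (u, v) at pixel (y, x) from a (2*radius+1)^2 window, least squares."""
    f0, f1 = frame0.astype(float), frame1.astype(float)
    Iy, Ix = np.gradient(f0)                  # spatial gradients (rows, cols)
    It = f1 - f0                              # temporal derivative
    sl = (slice(y - radius, y + radius + 1), slice(x - radius, x + radius + 1))
    A = np.column_stack([Ix[sl].ravel(), Iy[sl].ravel()])
    b = -It[sl].ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Synthetic check: a smooth blob shifted one pixel to the right between frames.
yy, xx = np.mgrid[0:64, 0:64]
frame0 = np.exp(-((xx - 30) ** 2 + (yy - 32) ** 2) / 50.0)
frame1 = np.exp(-((xx - 31) ** 2 + (yy - 32) ** 2) / 50.0)
print(np.round(lucas_kanade(frame0, frame1, 32, 30), 2))   # approx (1.0, 0.0)
```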
This paper focuses on the optimization of motion estimation algorithms for video coding standards using the Altera Custom Instructions paradigm and the combination of SDRAM with On-Chip memory in NIOS II processors. On the one hand, a complete profiling of the algorithm is carried out before the optimization in order to find the code's time leaks; a custom instruction set is then developed and added to the specific embedded design, enhancing the original system. On the other hand, all permitted combinations of On-Chip memory and SDRAM have been tested to find the best-performing configuration. The final performance of the resulting design (memory optimization plus custom instruction acceleration) is shown. This contribution thus outlines a low-cost system, mapped onto Very Large Scale Integration (VLSI) technology, which accelerates software algorithms by converting them into custom hardware logic blocks and shows the best combination of On-Chip memory and SDRAM for the NIOS II processor.
This paper compares FPGA-based fully pipelined multiplierless FIR filter design options. A comparison of Distributed Arithmetic (DA), Common Sub-Expression (CSE) sharing and n-dimensional Reduced Adder Graph (RAG-n) multiplierless filter design methods in terms of size, speed, and area-time (A*T) product is provided. Since DA designs are table-based and CSE/RAG-n designs are adder-based, FPGA synthesis data are used for a realistic comparison. Superior results of a genetic-algorithm-based optimization of pipeline registers and non-output fundamental coefficients are shown. FIR filters (posted as open source by Kastner et al.) with lengths from 6 to 151 coefficients are used.
KEYWORDS: Intellectual property, Reverse engineering, Digital watermarking, Transform theory, Embedded systems, Opacity, Java, Field programmable gate arrays, Integrated circuit design, System on a chip
One of the big challenges in the design of embedded systems today is how to combine design reuse with intellectual property protection (IPP). Strong IPP schemes such as hardware dongles or layout watermarking usually allow very limited design reuse across different FPGA/ASIC design platforms. Some techniques also do not fit well with the protection of software in embedded microprocessors. Another approach to IPP that allows easy design reuse and has low cost, at the price of somewhat reduced security, is code "obfuscation." Obfuscation is a method of hiding the design concept or program algorithm contained in C or HDL source code by applying one or more transformations to the original code. Obfuscation methods include, for instance, renaming identifiers and removing comments or code formatting. More sophisticated obfuscation methods include data splitting or merging and control-flow changes. This paper shows the strengths and weaknesses of methods for obfuscating C, VHDL and Verilog code.
A new approach to optimizing the parameters of a gradient-based optical flow model using a parallel genetic algorithm (GA) is proposed. The main characteristics of the optical flow algorithm are its bio-inspiration and robustness against contrast, static patterns and noise, besides working consistently with several optical illusions where other algorithms fail. This model depends on many parameters, including the number of channels, the orientations required, and the length and shape of the kernel functions used in the convolution stage, among many more. The GA is used to find a set of parameters that improves the accuracy of the optical flow on inputs for which ground-truth data are available. This set of parameters helps to understand which of them are better suited to each type of input, and can be used to estimate the parameters of the optical flow algorithm for videos that share similar characteristics. The proposed implementation takes into account the embarrassingly parallel nature of the GA and uses the OpenMP Application Programming Interface (API) to speed up the process of estimating an optimal set of parameters. The information obtained in this work can be used to dynamically reconfigure systems, with potential applications in robotics, medical imaging and tracking.
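The embarrassingly parallel structure mentioned above amounts to evaluating every individual's fitness independently. The sketch below illustrates this in Python with a process pool and a stand-in fitness function; the actual implementation uses OpenMP in compiled code, and the parameter ranges and fitness here are assumptions, not the paper's optical-flow evaluation.

```python
# A minimal sketch of a parallel GA: fitness of each candidate parameter set is
# evaluated concurrently, followed by truncation selection, arithmetic crossover
# and Gaussian mutation. Replace `fitness` with the real flow-error evaluation.
import numpy as np
from concurrent.futures import ProcessPoolExecutor

BOUNDS = np.array([[1, 8],      # number of channels
                   [2, 16],     # number of orientations
                   [3, 31]])    # kernel length (illustrative parameter ranges)

def fitness(params):
    """Stand-in for the flow error against ground truth; lower is better."""
    target = np.array([4.0, 8.0, 15.0])
    return float(np.sum((params - target) ** 2))

def evolve(pop_size=32, generations=40, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.uniform(BOUNDS[:, 0], BOUNDS[:, 1], size=(pop_size, len(BOUNDS)))
    with ProcessPoolExecutor() as pool:
        for _ in range(generations):
            scores = np.array(list(pool.map(fitness, pop)))   # parallel evaluation
            parents = pop[np.argsort(scores)[: pop_size // 2]]  # truncation selection
            idx_a = rng.integers(0, len(parents), pop_size)
            idx_b = rng.integers(0, len(parents), pop_size)
            children = 0.5 * (parents[idx_a] + parents[idx_b])   # arithmetic crossover
            children += rng.normal(0, 0.5, children.shape)       # Gaussian mutation
            pop = np.clip(children, BOUNDS[:, 0], BOUNDS[:, 1])
    return pop[np.argmin([fitness(p) for p in pop])]

if __name__ == "__main__":
    print(np.round(evolve(), 1))    # converges near the stand-in optimum (4, 8, 15)
```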
KEYWORDS: Motion estimation, Video coding, Digital signal processing, Video compression, Standards development, Video, Field programmable gate arrays, Data storage, Logic, Semantic video
This paper focuses on the hardware acceleration of motion compensation techniques suitable for MPEG video compression. A plethora of representative motion estimation search algorithms, together with new perspectives, is introduced. The methods and designs described here are qualified for the medical imaging area, where larger images are involved. The structure of the processing systems considered is a good fit for reconfigurable acceleration. The system is based on an FPGA platform running the Nios II microprocessor and applying C2H acceleration. The paper shows the results in terms of performance and resources needed.
Vision is typically considered the primary and most important of all the human senses. Motion detection, being a non-contact sense, allows us to extract vast quantities of information about our environment remotely and safely. The main motivation of this research contribution is the implementation of an architecture for a biologically inspired motion algorithm tuned specifically to correct optical flow (motion) in breast MRI. Neuromorphic engineering is used, borrowing nature's templates as inspiration in the design of algorithms and architectures. The architectures used can be enhanced with psychophysical and bioinspired properties, according to biological vision, in order to mimic the performance of the mammalian visual system.
The quadratic sieve (QS) algorithm is one of the most powerful algorithms for factoring the large composite numbers used in RSA cryptographic systems. The hardware structure of the QS algorithm seems to be a good fit for FPGA acceleration. Our new ε-QS algorithm further simplifies the hardware architecture, making it an even better candidate for C2H acceleration. This paper shows our design results, in terms of FPGA resources and performance, when implementing very long arithmetic on the Nios microprocessor platform with C2H acceleration for different libraries (GMP, LIP, FLINT, NRMP) and QS architecture choices for factoring 32-2048 bit RSA numbers.
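For readers unfamiliar with the quadratic sieve, the sketch below illustrates only the relation-collection stage on a toy semiprime; the factor-base bound, sieve span, and use of sympy helpers are assumptions, and neither the ε-QS variant nor its Nios II / C2H hardware mapping is reproduced here.

```python
# A minimal sketch of QS relation collection: values Q(x) = (m + x)^2 - n are
# tested for smoothness over a factor base of primes p with (n|p) = 1. The
# smooth relations would later be combined via linear algebra over GF(2).
from math import isqrt
from sympy import primerange, legendre_symbol

def factor_base(n, bound):
    return [2] + [p for p in primerange(3, bound) if legendre_symbol(n, p) == 1]

def smooth_over(q, base):
    """Return the exponent vector of q over `base`, or None if q is not smooth."""
    exps = []
    for p in base:
        e = 0
        while q % p == 0:
            q //= p
            e += 1
        exps.append(e)
    return exps if q == 1 else None

def collect_relations(n, bound=50, sieve_span=2000):
    base, m = factor_base(n, bound), isqrt(n) + 1
    relations = []
    for x in range(sieve_span):
        exps = smooth_over((m + x) ** 2 - n, base)
        if exps is not None:
            relations.append((m + x, exps))
    return base, relations

base, rels = collect_relations(87463)          # toy semiprime
print(len(base), len(rels))
```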
Since the world presents a dynamically changing environment, we need synthetic systems that can process and respond to motion. The main contribution of this work is the efficient implementation of a biologically inspired DSP architecture for gradient motion estimation that borrows nature's templates as inspiration and makes use of a specific model of human visual motion perception: the Multi-Channel Gradient Model. This model can be enhanced with psychophysical and bioinspired properties, according to biological vision, in order to mimic the behavior and performance of the mammalian visual system. The architecture is designed as an asynchronous pipeline chaining several signal processing stages. Experimental results and resource consumption are discussed, analyzing the associated customization of the system. The work concludes with several comparisons against current contributions and with examples for synthetic sequences and real image applications.