
image coding standard on the Internet, was published in 1992. MPEG-1, the first digital video compression standard by ISO, was standardized in 1993. The IEEE International Conference on Image Processing (ICIP), the flagship image processing conference held annually by the IEEE Signal Processing Society (SPS), also started in 1993 and has been in existence for 27 years, making it older than many image processing researchers now studying in graduate school! Given the topic’s maturity, it is legitimate to ask whether yet another book on image processing is warranted. As co-editors of this book, we answer this question with a resounding “Yes”. We first discuss the following recent technological trends, which also serve as motivation for this publication.

      1) Sensing and Display Technologies: The advent of new image sensing technologies, such as active depth sensors, and display technologies, such as head-mounted displays (HMD), in the last decade alone means that the nature of a digital image has changed drastically. Beyond higher spatial resolution and bit-depth per pixel, a modern imaging sensor can also acquire scene depth, hyperspectral properties, etc. Further, acquired image data is often not represented as a traditional 2D array of pixels, but in an alternative form, such as light fields and 3D point clouds. This means that processing tools must flexibly adapt to richer and evolving imaging content and formats.

      3) Deep Neural Networks: Without a doubt, the singular seismic paradigm shift in data science in the last decade is deep learning. Using layers of convolutional filters, pointwise nonlinearities and pooling functions, deep neural network (DNN) architectures such as convolutional neural networks (CNN) have demonstrated superior performance in a wide range of imaging tasks, from denoising to classification, when a large volume of labeled data is available for training (Vemulapalli et al. 2016; Zhang et al. 2017). When labeled training data is scarce, or when the underlying data kernel is irregular (thus complicating the training of convolutional filters and the selection of pooling operators), how to best design and construct a DNN for a targeted imaging application is a challenging problem. Moreover, a CNN trained purely from labeled data often remains a “black box”, i.e. the learned operators, such as filters, remain unexplainable.

      Specifically, the content of this book is structured into two parts:

      1) The first part of the book discusses the fundamental GSP theories. Chapter 1, titled “Graph Spectral Filtering” by Y. Tanaka, reviews the basics of graph filtering such as graph transforms and wavelets. Chapter 2, titled “Graph Learning” by X. Dong, D. Thanou, M. Rabbat and P. Frossard, reviews recent techniques to learn an underlying graph structure given a set of observable data. Chapter 3, titled “Graph Neural Networks” by G. Fracastoro and D. Valsesia, overviews recent works generalizing DNN architectures to the graph data domain, where input signals reside on irregular graph structures.

      2) The second part of the book reviews different imaging applications of GSP. Chapters 4 and 5, titled “Graph Spectral Image and Video Compression” by H.E. Egilmez, Y.-H. Chao and A. Ortega and “Graph Spectral 3D Image Compression” by T. Maugey, M. Rizkallah, N. M. Bidgoli, A. Roumy and C. Guillemot, focus on the design and applications of GSP tools for the compression of traditional images/videos and 3D images, respectively. Chapter 6, titled “Graph Spectral Image Restoration” by J. Pang and J. Zeng, focuses on the general recovery of corrupted images, e.g. image denoising and deblurring. Chapter 7, titled “Graph Spectral Point Cloud Processing” by W. Hu, S. Chen and D. Tian, focuses on the processing of 3D point clouds, a new imaging modality, for applications such as low-level restoration and high-level unsupervised feature learning. Chapters 8 and 9, titled “Graph Spectral Image Segmentation” by M. Ng and “Graph Spectral Image Classification” by M. Ye, V. Stankovic, L. Stankovic and G. Cheung, narrow the discussion to segmentation and classification, respectively, two popular research topics in the computer vision community. Finally, Chapter 10, titled “Graph Neural Networks for Image Processing” by G. Fracastoro and D. Valsesia, reviews the growing efforts to employ recent GNN architectures for conventional imaging tasks such as denoising.

      Before we jump into the various chapters, we begin with the basic definitions in GSP that will be used throughout the book. Specifically, we formally define a graph, graph spectrum, variation operators and graph signal smoothness priors in the following sections.

      A graph G(V, E, W) contains a set V of N nodes and a set E of M edges. While directed graphs are also possible, in this book we more commonly assume an undirected graph, where each existing edge (i, j) ∈ E is undirected and contains an edge weight w_{i,j} ∈ ℝ, which is typically positive. A large positive edge weight w_{i,j} would mean that samples at nodes i and j are expected to be similar/correlated.
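      To make the definition concrete, the following minimal Python sketch (not from the book; the node count and edge list are illustrative assumptions) stores such an undirected weighted graph as a symmetric adjacency matrix W, from which the combinatorial graph Laplacian L = D − W, a common variation operator in GSP, can be formed:

import numpy as np

# Illustrative sketch: an undirected graph G(V, E, W) with N = 4 nodes.
# Each tuple is an edge (i, j) with weight w_{i,j} > 0; values are arbitrary.
N = 4
edges = [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.2), (0, 3, 0.5)]

W = np.zeros((N, N))          # adjacency (weight) matrix
for i, j, w in edges:
    W[i, j] = W[j, i] = w     # undirected: W is symmetric

D = np.diag(W.sum(axis=1))    # degree matrix
L = D - W                     # combinatorial graph Laplacian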

      There are many ways to compute appropriate edge weights. Especially common for images, edge weight w_{i,j} can be computed using a Gaussian kernel, as done in the bilateral filter (Tomasi and Manduchi 1998):

      [I.1]   $w_{i,j} = \exp\left(-\frac{\|\mathbf{l}_i - \mathbf{l}_j\|_2^2}{\sigma_l^2}\right)\exp\left(-\frac{|x_i - x_j|^2}{\sigma_x^2}\right)$

      where l_i ∈ ℝ² is the location of pixel i on the 2D image grid, x_i ∈ ℝ is the intensity of pixel i, and σ_l and σ_x are two parameters. Hence, 0 ≤ w_{i,j} ≤ 1. Larger σ_l and σ_x mean that the weights decay more slowly with increasing inter-pixel distance and intensity difference, respectively.
