IEEE Transactions on Neural Networks, Special Issue on Neural Networks and Pattern Recognition, Volume 8, Number 1, pp. 98–113, 1997. Copyright IEEE.
Face Recognition: A Convolutional Neural Network Approach
Steve Lawrence, C. Lee Giles, Ah Chung Tsoi, Andrew D. Back
NEC Research Institute, 4 Independence Way, Princeton, NJ 08540
Electrical and Computer Engineering, University of Queensland, St. Lucia, Australia
Abstract
Faces represent complex, multidimensional, meaningful visual stimuli and developing a computational model for face recognition is difficult [43]. We present a hybrid neural network solution which compares favorably with other methods. The system combines local image sampling, a self-organizing map neural network, and a convolutional neural network. The self-organizing map provides a quantization of the image samples into a topological space where inputs that are nearby in the original space are also nearby in the output space, thereby providing dimensionality reduction and invariance to minor changes in the image sample, and the convolutional neural network provides for partial invariance to translation, rotation, scale, and deformation. The convolutional network extracts successively larger features in a hierarchical set of layers. We present results using the Karhunen-Loève transform in place of the self-organizing map, and a multi-layer perceptron in place of the convolutional network. The Karhunen-Loève transform performs almost as well (5.3% error versus 3.8%). The multi-layer perceptron performs very poorly (40% error versus 3.8%). The method is capable of rapid classification, requires only fast, approximate normalization and preprocessing, and consistently exhibits better classification performance than the eigenfaces approach [43] on the database considered as the number of images per person in the training database is varied from 1 to 5. With 5 images per person the proposed method and eigenfaces result in 3.8% and 10.5% error respectively. The recognizer provides a measure of confidence in its output and classification error approaches zero when rejecting as few as 10% of the examples. We use a database of 400 images of 40 individuals which contains quite a high degree of variability in expression, pose, and facial details. We analyze computational complexity and discuss how new classes could be added to the trained recognizer.
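The first stage summarized above, local image sampling followed by self-organizing map (SOM) quantization, can be sketched as follows. This is a minimal illustration under assumed parameters (window size, step, grid size, learning schedule are all choices made here, not taken from the paper), not the authors' implementation.

```python
import numpy as np

def sample_windows(image, size=5, step=4):
    """Extract overlapping size x size local windows from a 2-D greyscale image."""
    h, w = image.shape
    windows = []
    for i in range(0, h - size + 1, step):
        for j in range(0, w - size + 1, step):
            windows.append(image[i:i + size, j:j + size].ravel())
    return np.array(windows, dtype=float)

def train_som(samples, grid=(5, 5), iters=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a 2-D SOM: nearby nodes in the grid come to represent nearby inputs."""
    rng = np.random.default_rng(seed)
    nodes = rng.standard_normal((grid[0], grid[1], samples.shape[1]))
    coords = np.indices(grid).transpose(1, 2, 0).astype(float)  # grid positions
    for t in range(iters):
        x = samples[rng.integers(len(samples))]
        d = np.linalg.norm(nodes - x, axis=2)
        winner = np.unravel_index(np.argmin(d), grid)
        lr = lr0 * (1 - t / iters)                     # decaying learning rate
        sigma = sigma0 * (1 - t / iters) + 0.5         # shrinking neighborhood
        g = np.exp(-np.sum((coords - winner) ** 2, axis=2) / (2 * sigma ** 2))
        nodes += lr * g[..., None] * (x - nodes)       # move nodes toward x
    return nodes

def quantize(samples, nodes):
    """Map each sample to the 2-D grid location of its best-matching node."""
    flat = nodes.reshape(-1, nodes.shape[2])
    idx = np.argmin(np.linalg.norm(flat[None, :, :] - samples[:, None, :], axis=2), axis=1)
    return np.column_stack(np.unravel_index(idx, nodes.shape[:2]))

image = np.random.default_rng(1).random((112, 92))  # dummy ORL-sized image
windows = sample_windows(image)
som = train_som(windows)
codes = quantize(windows, som)  # each window -> (row, col) on the SOM grid
```

The (row, col) codes replace each high-dimensional window with a low-dimensional, topologically ordered representation, which is what makes them suitable inputs for the convolutional network stage.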
1 Introduction
The requirement for reliable personal identification in computerized access control has resulted in an increased interest in biometrics¹. Biometrics being investigated include fingerprints [4], speech [7], signature dynamics [36], and face recognition [8]. Sales of identity verification products exceed $100 million [29]. Face recognition has the benefit of being a passive, non-intrusive system for verifying personal identity. The techniques used in the best face recognition systems may depend on the application of the system. We can identify at least two broad categories of face recognition systems:
¹Physiological or behavioral characteristics which uniquely identify us.
Also with the Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742.
- We want to find a person within a large database of faces (e.g. in a police database). These systems typically return a list of the most likely people in the database [34]. Often only one image is available per person. It is usually not necessary for recognition to be done in real-time.
- We want to identify particular people in real-time (e.g. in a security monitoring system, location tracking system, etc.), or we want to allow access to a group of people and deny access to all others (e.g. access to a building, computer, etc.) [8]. Multiple images per person are often available for training and real-time recognition is required.
In this paper, we are primarily interested in the second case. We are interested in recognition with varying facial detail, expression, pose, etc. We do not consider invariance to high degrees of rotation or scaling – we assume that a minimal preprocessing stage is available if required. We are interested in rapid classification and hence we do not assume that time is available for extensive preprocessing and normalization. Good algorithms for locating faces in images can be found in [43, 40, 37].
The remainder of this paper is organized as follows. The data we used is presented in section 2 and related work with this and other databases is discussed in section 3. The components and details of our system are described in sections 4 and 5 respectively. We present and discuss our results in sections 6 and 7. Computational complexity is considered in section 8 and we draw conclusions in section 10.
2 Data
We have used the ORL database, which contains a set of faces taken between April 1992 and April 1994 at the Olivetti Research Laboratory in Cambridge, UK [2]. There are 10 different images of 40 distinct subjects. For some of the subjects, the images were taken at different times. There are variations in facial expression (open/closed eyes, smiling/non-smiling) and facial details (glasses/no glasses). All the images were taken against a dark homogeneous background with the subjects in an upright, frontal position, with tolerance for some tilting and rotation of up to about 20 degrees. There is some variation in scale of up to about 10%. Thumbnails of all of the images are shown in figure 1 and a larger set of images for one subject is shown in figure 2. The images are greyscale with a resolution of 92×112 pixels.
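The experiments vary the number of training images per person from 1 to 5, with the remaining images of each subject held out for testing. A minimal sketch of such a partition of the 400 ORL images, using (subject, image) identifiers in place of actual files, is:

```python
import random

def split_orl(n_subjects=40, n_images=10, n_train=5, seed=0):
    """Partition the ORL identifiers into per-subject train/test sets.

    Each subject contributes n_train randomly chosen images to the training
    set and the remaining n_images - n_train images to the test set.
    """
    rng = random.Random(seed)
    train, test = [], []
    for subject in range(1, n_subjects + 1):
        ids = list(range(1, n_images + 1))
        rng.shuffle(ids)  # random choice of which images train this subject
        train += [(subject, i) for i in ids[:n_train]]
        test += [(subject, i) for i in ids[n_train:]]
    return train, test

train, test = split_orl(n_train=5)  # 200 training and 200 test images
```

With n_train = 5 this reproduces the 200/200 split used for the headline error rates; lowering n_train toward 1 reproduces the harder settings in which the method is compared against eigenfaces.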
3 Related Work
3.1 Geometrical Features
Many people have explored geometrical feature based methods for face recognition.
Kanade [17] presented an automatic feature extraction method based on ratios of distances and reported a recognition rate of between 45 and 75% with a database of 20 people.
Figure 1. The ORL face database: 10 images each of the 40 subjects.
Brunelli and Poggio [6] computed a set of geometrical features such as nose width and length, mouth position, and chin shape. They reported a 90% recognition rate on a database of 47 people. However, a simple template matching scheme provided 100% recognition for the same database. Cox et al. [9] recently introduced a mixture-distance technique which achieved a recognition rate of 95% using a query database of 95 images from a total of 685 individuals. Each face was represented by 30 manually extracted distances.
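A whole-face template matching scheme of the "simple" kind referred to above can be illustrated with normalized cross-correlation: each gallery image serves as a template, and a probe is assigned the identity of the most correlated template. This is a hedged sketch on synthetic stand-in data, not Brunelli and Poggio's actual system.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equal-sized greyscale images."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def classify(probe, gallery):
    """Return the gallery identity whose template best correlates with probe."""
    scores = {identity: ncc(probe, tmpl) for identity, tmpl in gallery.items()}
    return max(scores, key=scores.get)

rng = np.random.default_rng(0)
# Five synthetic "subjects" at the ORL image size; labels s1..s5 are arbitrary.
gallery = {f"s{k}": rng.random((112, 92)) for k in range(1, 6)}
probe = gallery["s3"] + 0.05 * rng.standard_normal((112, 92))  # noisy view of s3
match = classify(probe, gallery)  # expected to recover "s3"
```

Such whole-image correlation works well on carefully registered databases, which is consistent with its 100% result above, but it degrades under the translation, scale, and pose variation that the convolutional approach is designed to tolerate.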
Figure 2. The set of 10 images for one subject. Considerable variation can be seen.
Employing precisely measured distances between features requires automatic identification of these feature points, and the resulting system depends on the accuracy of the feature location algorithm. Current algorithms for the automatic location of feature points do not provide a high degree of accuracy and require considerable computational capacity [41].
3.2 Eigenfaces
High-level recognition tasks are typically modeled with many stages of processing, as in the Marr paradigm of progressing from images to surfaces to three-dimensional models to matched models [28]. However, Turk and Pentland [43] argue that it is likely that there is also a recognition process based on low-level, two-dimensional image processing. Their argument is based on the early development and extreme rapidity of face recognition in humans, and on physiological experiments in monkey cortex which claim to have isolated neurons that respond selectively to faces [35]. However, it is not clear that these experiments exclude the sole operation of the Marr paradigm. Turk and Pentland [43] present a face recognition scheme in which face images are projected onto the principal components of the original set of training images. The resulting eigenfaces are classified by comparison with known individuals.
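The projection scheme described above can be sketched as follows. This is a minimal illustration of the eigenface idea on synthetic stand-in data (the number of components and all sizes are arbitrary choices here), not Turk and Pentland's system.

```python
import numpy as np

def eigenfaces(X, k):
    """X: (n_images, n_pixels) training matrix. Returns mean face and top-k eigenfaces."""
    mean = X.mean(axis=0)
    A = X - mean
    # Gram-matrix trick: eigenvectors of the small n x n matrix A A^T lift to
    # eigenvectors of the large pixel-space covariance A^T A.
    vals, vecs = np.linalg.eigh(A @ A.T)
    order = np.argsort(vals)[::-1][:k]       # top-k eigenvalues, descending
    U = A.T @ vecs[:, order]                 # (n_pixels, k) eigenfaces
    U /= np.linalg.norm(U, axis=0)
    return mean, U

def project(x, mean, U):
    """Coordinates of face(s) x in the k-dimensional eigenface space."""
    return (x - mean) @ U

rng = np.random.default_rng(0)
n_pixels = 92 * 112                          # ORL image size, flattened
faces = rng.random((10, n_pixels))           # 10 synthetic "training faces"
labels = np.repeat(np.arange(5), 2)          # 5 identities, 2 images each
mean, U = eigenfaces(faces, k=6)
coeffs = project(faces, mean, U)

# Classify a noisy probe by nearest neighbour in the projected space.
probe = faces[7] + 0.01 * rng.standard_normal(n_pixels)
q = project(probe, mean, U)
nearest = labels[np.argmin(np.linalg.norm(coeffs - q, axis=1))]
```

The nearest-neighbour comparison in the low-dimensional coefficient space is what makes the method fast at classification time; its sensitivity to registration and lighting is the weakness discussed in the results below.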
Turk and Pentland present results on a database of 16 subjects with various head orientations, scaling, and lighting. Their images otherwise appear identical, with little variation in facial expression, facial details, pose, etc. For lighting, orientation, and scale variation their system achieves 96%, 85%, and 64% correct classification respectively. Scale is renormalized to the eigenface size based on an estimate of the head size. The middle of the faces is accentuated, reducing any negative effect of changing hairstyles and backgrounds.
In Pentland et al. [34, 33] good results are reported on a large database (95% recognition of 200 people from a database of 3,000). It is difficult to draw broad conclusions, as many of the images of the same people look very similar and the database has accurate registration and alignment [30]. In Moghaddam and Pentland [30], very good results are reported with the FERET database: only one mistake was made in classifying 150 frontal-view images. The system used extensive preprocessing for head location, feature detection, and normalization for the geometry of the face, translation, lighting, contrast, rotation, and scale.