Face recognition is a biometric technology that identifies a person from facial features. A camera captures a face image or video stream, automatically detects and tracks faces in the frame, and then applies face-related processing: face detection, facial landmark detection, face verification, and so on. Alipay's "Paying with Your Face" made MIT Technology Review's 2017 list of ten breakthrough technologies.
Face recognition has three advantages: it is non-intrusive (capture is hard to notice, and the subject's face image can be acquired proactively), contactless (the user never touches the device), and concurrent (multiple faces can be detected, tracked, and recognized at once). Before deep learning, face recognition took two steps: extracting high-dimensional hand-crafted features, then reducing their dimensionality. Traditional face recognition works on visible-light images. Deep learning plus big data (massive labeled face datasets) is now the mainstream approach: a neural network trains the recognition model on large numbers of sample images, learns features on its own rather than relying on hand-picked ones, and can reach 99% recognition accuracy.
The face recognition pipeline.
Face image capture and detection. In capture, a camera records face images: still or moving, at different positions, with different expressions. When the user is within the capture device's shooting range, the device searches for and photographs the face automatically. Face detection is a form of object detection (object detection): statistics over the target object yield its characteristic features, from which a detection model is built; the model is matched against the input image and the matching regions are output. Face detection is the preprocessing step of face recognition, accurately locating the position and size of each face in the image. Face image patterns are rich in features: histogram features, color features, template features, structural features, and Haar-like features. Face detection picks out the useful information and uses these features to find faces. Among detection algorithms, template matching models and AdaBoost models, AdaBoost has the best combined speed and accuracy: training is slow, but detection is fast, fast enough for real-time detection on video streams. A minimal sketch follows.
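As an illustration, a minimal detection sketch with OpenCV's pre-trained frontal-face Haar cascade (assuming OpenCV is installed and a hypothetical input image test.jpg exists; the cascade file ships with recent OpenCV builds):

import cv2

# Load OpenCV's pre-trained frontal-face Haar cascade (trained with AdaBoost).
cascade_path = cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
detector = cv2.CascadeClassifier(cascade_path)

img = cv2.imread('test.jpg')                  # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # Haar features work on grayscale
# scaleFactor and minNeighbors trade off recall against false positives
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite('detected.jpg', img)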
Face image preprocessing. Based on the detection result, the image is processed to serve feature extraction. Because the captured image is subject to all kinds of conditions and random interference, it usually needs preprocessing: scaling, rotation, stretching, light compensation, grayscale transformation, histogram equalization, normalization, geometric correction, filtering, sharpening, and so on.
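A minimal preprocessing sketch with OpenCV, covering a few of the steps above (grayscale conversion, histogram equalization, and size normalization; face_crop is assumed to come from the detection step):

import cv2
import numpy as np

def preprocess_face(face_crop, size=160):
    gray = cv2.cvtColor(face_crop, cv2.COLOR_BGR2GRAY)  # grayscale transform
    equalized = cv2.equalizeHist(gray)                  # histogram equalization (light compensation)
    resized = cv2.resize(equalized, (size, size))       # geometric normalization to a fixed size
    return resized.astype(np.float32) / 255.0           # scale pixel values to [0, 1]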
Face image feature extraction. The face image's information is digitized: the image is turned into a string of numbers (a feature vector). For example, from the positions of the eyes, corners of the mouth, nose, and chin, feature components such as the Euclidean distances, curvatures, and angles between landmark points are extracted and concatenated into one long feature vector.
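A toy sketch of turning landmark coordinates into such a feature vector (the landmark array is hypothetical; real systems use learned embeddings or many more geometric measurements):

import numpy as np

# Hypothetical 2D landmarks: two eyes, nose tip, two mouth corners, chin.
landmarks = np.array([[30, 40], [70, 40], [50, 60],
                      [35, 80], [65, 80], [50, 100]], dtype=np.float32)

# Pairwise Euclidean distances between all landmarks,
# concatenated into a single feature vector.
diffs = landmarks[:, None, :] - landmarks[None, :, :]
dists = np.sqrt((diffs ** 2).sum(-1))
iu = np.triu_indices(len(landmarks), k=1)
feature_vector = dists[iu]   # 15 distance components for 6 points
print(feature_vector.shape)  # (15,)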
Face image matching and recognition. The extracted feature data is searched and matched against the face feature templates stored in a database, and identity is judged by similarity: set a threshold, and when the similarity exceeds it, output the match. Verification is a one-to-one (1:1) comparison proving "you are who you claim to be", used for identity checks in finance and information security. Identification is a one-to-many (1:N) match, "finding you among N people": with a video stream, recognition completes as soon as a person walks into range, which suits security applications.
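Verification then reduces to comparing two feature vectors against a threshold. A sketch, assuming two embeddings produced by the same extractor (the threshold value is model-specific and must be tuned on a validation set):

import numpy as np

def is_same_person(emb_a, emb_b, threshold=1.1):
    # Squared Euclidean distance between L2-normalized embeddings;
    # below the threshold means "same person".
    emb_a = emb_a / np.linalg.norm(emb_a)
    emb_b = emb_b / np.linalg.norm(emb_b)
    dist = np.sum((emb_a - emb_b) ** 2)
    return dist < threshold, dist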
Categories of face recognition tasks.
Face detection. Detect and locate faces in an image and return high-precision bounding-box coordinates; it is the first step of any face analysis. The classic approach is the "sliding window": take a rectangular region of the image as a window, extract features from the window to describe that region, use the feature description to decide whether the window contains a face, and repeat while traversing all windows to be examined.
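A schematic of the sliding-window idea (window_score is a placeholder standing in for a real feature-based face/non-face classifier):

import numpy as np

def window_score(patch):
    # Placeholder: a real detector would compute Haar or other features
    # here and run a trained classifier on them.
    return float(np.mean(patch))

def sliding_window_detect(image, win=24, stride=8, threshold=0.9):
    boxes = []
    h, w = image.shape[:2]
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            if window_score(image[y:y + win, x:x + win]) > threshold:
                boxes.append((x, y, win, win))  # (x, y, width, height)
    return boxes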
Facial landmark detection. Locate and return the coordinates of key points on the facial features and contour: the face outline and the contours of the eyes, eyebrows, lips, and nose. Face++ offers up to 106 landmarks. A common landmark localization technique is cascaded shape regression (CSR). The recognition model here is based on the DeepID network. DeepID is structured like an ordinary convolutional neural network except that the second-to-last layer, the DeepID layer, is connected to both convolutional layer 4 and max-pooling layer 3; since higher convolutional layers have larger receptive fields, the network considers both local and global features. Layer sizes: input 31x39x1; conv1 28x36x20 (4x4x1 kernels); pool1 14x18x20 (2x2 filter); conv2 12x16x40 (3x3x20 kernels); pool2 6x8x40 (2x2 filter); conv3 4x6x60 (3x3x40 kernels); pool3 2x3x60 (2x2 filter); conv4 1x2x80 (2x2x60 kernels); DeepID layer 1x160; fully connected softmax output. See "Deep Learning Face Representation from Predicting 10,000 Classes".
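As an illustration only (not the original implementation), a tf.keras sketch of this topology; the layer sizes follow the table above, and the class count is a placeholder:

import tensorflow as tf
from tensorflow.keras import layers

def build_deepid(num_classes=10000):
    inp = tf.keras.Input(shape=(31, 39, 1))
    x = layers.Conv2D(20, 4, activation='relu')(inp)       # 28x36x20
    x = layers.MaxPooling2D(2)(x)                          # 14x18x20
    x = layers.Conv2D(40, 3, activation='relu')(x)         # 12x16x40
    x = layers.MaxPooling2D(2)(x)                          # 6x8x40
    x = layers.Conv2D(60, 3, activation='relu')(x)         # 4x6x60
    pool3 = layers.MaxPooling2D(2)(x)                      # 2x3x60
    conv4 = layers.Conv2D(80, 2, activation='relu')(pool3) # 1x2x80
    # The DeepID layer is fully connected to BOTH pool3 and conv4,
    # mixing mid-level (local) and high-level (global) features.
    merged = layers.concatenate([layers.Flatten()(pool3),
                                 layers.Flatten()(conv4)])
    deepid = layers.Dense(160, activation='relu', name='deepid')(merged)
    out = layers.Dense(num_classes, activation='softmax')(deepid)
    return tf.keras.Model(inp, out)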
Face verification. Analyze how likely two face images are to belong to the same person: given two faces, produce a confidence score and compare it against a threshold to assess similarity.
Face attribute detection. Attribute recognition and emotion analysis on a face. An online face recognition demo, for example, reports a person's age, whether they have a beard, emotion (happy, neutral, angry), gender, whether they wear glasses, and skin tone.
Applications of face recognition include Meitu's beautification app, Jiayuan's "face-compatibility" lookup for potential partners, "pay with your face" in payments, and face-based authentication in security. Face++ and SenseTime both offer face recognition SDKs.
Face detection.
Florian Schroff, Dmitry Kalenichenko, and James Philbin, "FaceNet: A Unified Embedding for Face Recognition and Clustering".
The LFW (Labeled Faces in the Wild) dataset, assembled by the computer vision lab at the University of Massachusetts Amherst: 13,233 images of 5,749 people; 4,069 people have only one image, and 1,680 have more than one. Each image is 250x250. The face images are stored in a folder named after each person.
Data preprocessing. The alignment code aligns and resizes the evaluation dataset to match the image size used by the pre-trained model. Set the environment variable:
export PYTHONPATH=[...]/facenet/src
Alignment command:
for N in {1..4}; do python src/align/align_dataset_mtcnn.py ~/datasets/lfw/raw ~/datasets/lfw/lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction 0.25 & done
Pre-trained model: 20170216-091149.zip, trained on the MS-Celeb-1M dataset, Microsoft's face recognition database built by taking the top one million celebrities from a popularity ranking and collecting about 100 face images per celebrity via search engines. The pre-trained model's accuracy is 0.993±0.004.
Evaluation. Run:
python src/validate_on_lfw.py datasets/lfw/lfw_mtcnnpy_160 models
For the benchmark comparison it uses facenet/data/pairs.txt, officially generated random pairs of matched and mismatched person names and image numbers.
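pairs.txt has a simple line format: a matched pair is "name idx1 idx2" and a mismatched pair is "name1 idx1 name2 idx2", after a first line holding the fold count and pairs per fold. A minimal parsing sketch (facenet's lfw.read_pairs does essentially this):

def read_pairs(path):
    pairs = []
    with open(path, 'r') as f:
        next(f)  # skip the header line (fold count, pairs per fold)
        for line in f:
            pairs.append(line.strip().split())
    return pairs  # e.g. [['Abel_Pacheco', '1', '4'], ...]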
Ten-fold cross validation (10-fold cross validation) is the accuracy testing method used here: split the dataset into 10 parts, take 9 in turn as the training set and 1 as the test set, and use the mean of the 10 results as the accuracy estimate. In general, multiple rounds of 10-fold cross validation are run and averaged.
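The idea in code, using scikit-learn's KFold on hypothetical arrays of per-pair distances and same/different labels:

import numpy as np
from sklearn.model_selection import KFold

def cross_validated_accuracy(distances, labels, threshold=1.1, n_folds=10):
    # distances: (N,) pair distances; labels: (N,) 1 = same person, 0 = different.
    kf = KFold(n_splits=n_folds, shuffle=False)
    accuracies = []
    for train_idx, test_idx in kf.split(distances):
        # In the real protocol the threshold is chosen on the 9 training
        # folds; a fixed threshold keeps this sketch short.
        predictions = distances[test_idx] < threshold
        accuracies.append(np.mean(predictions == labels[test_idx]))
    return np.mean(accuracies), np.std(accuracies)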
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import numpy as np
import argparse
import facenet
import lfw
import os
import sys
import math
from sklearn import metrics
from scipy.optimize import brentq
from scipy import interpolate

def main(args):
    with tf.Graph().as_default():
        with tf.Session() as sess:
            # 1. Read the file containing the pairs used for testing,
            #    e.g. [['Abel_Pacheco', '1', '4']]
            pairs = lfw.read_pairs(os.path.expanduser(args.lfw_pairs))
            # Get the paths for the corresponding images and the
            # matched/mismatched flag for each pair
            paths, actual_issame = lfw.get_paths(os.path.expanduser(args.lfw_dir), pairs, args.lfw_file_ext)

            # 2. Load the model
            facenet.load_model(args.model)

            # Get input and output tensors
            images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
            embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
            phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")

            #image_size = images_placeholder.get_shape()[1] # For some reason this doesn't work for frozen graphs
            image_size = args.image_size
            embedding_size = embeddings.get_shape()[1]

            # 3. Run forward pass to calculate embeddings
            print('Running forward pass on LFW images')
            batch_size = args.lfw_batch_size
            nrof_images = len(paths)
            nrof_batches = int(math.ceil(1.0 * nrof_images / batch_size))  # total number of batches
            emb_array = np.zeros((nrof_images, embedding_size))
            for i in range(nrof_batches):
                start_index = i * batch_size
                end_index = min((i + 1) * batch_size, nrof_images)
                paths_batch = paths[start_index:end_index]
                images = facenet.load_data(paths_batch, False, False, image_size)
                feed_dict = {images_placeholder: images, phase_train_placeholder: False}
                emb_array[start_index:end_index, :] = sess.run(embeddings, feed_dict=feed_dict)

            # 4. Compute accuracy and validation rate with 10-fold cross validation
            tpr, fpr, accuracy, val, val_std, far = lfw.evaluate(emb_array,
                actual_issame, nrof_folds=args.lfw_nrof_folds)

            print('Accuracy: %1.3f+-%1.3f' % (np.mean(accuracy), np.std(accuracy)))
            print('Validation rate: %2.5f+-%2.5f @ FAR=%2.5f' % (val, val_std, far))

            # Area under the ROC curve (AUC)
            auc = metrics.auc(fpr, tpr)
            print('Area Under Curve (AUC): %1.3f' % auc)
            # Equal error rate (EER)
            eer = brentq(lambda x: 1. - x - interpolate.interp1d(fpr, tpr)(x), 0., 1.)
            print('Equal Error Rate (EER): %1.3f' % eer)

def parse_arguments(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('lfw_dir', type=str,
        help='Path to the data directory containing aligned LFW face patches.')
    parser.add_argument('--lfw_batch_size', type=int,
        help='Number of images to process in a batch in the LFW test set.', default=100)
    parser.add_argument('model', type=str,
        help='Could be either a directory containing the meta_file and ckpt_file or a model protobuf (.pb) file')
    parser.add_argument('--image_size', type=int,
        help='Image size (height, width) in pixels.', default=160)
    parser.add_argument('--lfw_pairs', type=str,
        help='The file containing the pairs to use for validation.', default='data/pairs.txt')
    parser.add_argument('--lfw_file_ext', type=str,
        help='The file extension for the LFW dataset.', default='png', choices=['jpg', 'png'])
    parser.add_argument('--lfw_nrof_folds', type=int,
        help='Number of folds to use for cross validation. Mainly used for testing.', default=10)
    return parser.parse_args(argv)

if __name__ == '__main__':
    main(parse_arguments(sys.argv[1:]))
Gender and age recognition.
The Adience dataset: 26,580 images of 2,284 subjects, with ages binned into 8 ranges (0~2, 4~6, 8~13, 15~20, 25~32, 38~43, 48~53, 60~), including noise and variations in pose and lighting. The aligned directory holds the cropped, aligned data; faces holds the original data. fold_0_data.txt through fold_4_data.txt label all of the data; fold_frontal_0_data.txt through fold_frontal_4_data.txt label only near-frontal faces. Fields: user_id (the subject's Flickr account ID), original_image (image file name), face_id (person identifier), age, gender, x, y, dx, dy (face bounding box), tilt_ang (tilt angle), fiducial_yaw_angle (yaw angle), fiducial_score (fiducial score). A loading sketch follows.
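The fold files are tab-separated with a header row, so they load directly; a sketch with pandas (the column names follow the list above, while the aligned filename convention is an assumption here):

import pandas as pd

# Load one fold of Adience labels (tab-separated, with header).
fold = pd.read_csv('fold_0_data.txt', sep='\t')
# Reconstruct each aligned face image path from user_id, face_id
# and original_image (assumed naming convention).
fold['path'] = ('aligned/' + fold['user_id'] + '/landmark_aligned_face.' +
                fold['face_id'].astype(str) + '.' + fold['original_image'])
print(fold[['path', 'age', 'gender']].head())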
Data preprocessing. A script converts the data into TFRecords format: the image list from the Adience dataset is turned into TFRecords files, with each image resized to 256x256 and stored as a JPEG-encoded RGB image, written with tf.python_io.TFRecordWriter to the output file output_file.
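A condensed sketch of that writing step (TF 1.x API, as in the text; the feature keys and the (path, label) example list are illustrative):

import tensorflow as tf

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def write_tfrecords(examples, output_file):
    # examples: hypothetical list of (jpeg_path, integer_label) pairs,
    # with images already resized to 256x256 JPEG.
    with tf.python_io.TFRecordWriter(output_file) as writer:
        for path, label in examples:
            with tf.gfile.FastGFile(path, 'rb') as f:
                image_data = f.read()  # JPEG-encoded bytes
            example = tf.train.Example(features=tf.train.Features(feature={
                'image/encoded': _bytes_feature(image_data),
                'image/class/label': _int64_feature(label)}))
            writer.write(example.SerializeToString())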
Building the model. The age and gender training models follow Gil Levi and Tal Hassner's paper "Age and Gender Classification Using Convolutional Neural Networks", implemented with tensorflow.contrib.slim.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import time
import os
import numpy as np
import tensorflow as tf
from data import distorted_inputs
import re
from tensorflow.contrib.layers import *
from tensorflow.contrib.slim.python.slim.nets.inception_v3 import inception_v3_base

TOWER_NAME = 'tower'

def select_model(name):
    if name.startswith('inception'):
        print('selected (fine-tuning) inception model')
        return inception_v3
    elif name == 'bn':
        print('selected batch norm model')
        return levi_hassner_bn
    print('selected default model')
    return levi_hassner

def get_checkpoint(checkpoint_path, requested_step=None, basename='checkpoint'):
    if requested_step is not None:
        model_checkpoint_path = '%s/%s-%s' % (checkpoint_path, basename, requested_step)
        if not os.path.exists(model_checkpoint_path):
            print('No checkpoint file found at [%s]' % checkpoint_path)
            exit(-1)
        print(model_checkpoint_path)
        return model_checkpoint_path, requested_step

    ckpt = tf.train.get_checkpoint_state(checkpoint_path)
    if ckpt and ckpt.model_checkpoint_path:
        # Restore checkpoint as described in top of this program
        print(ckpt.model_checkpoint_path)
        global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
        return ckpt.model_checkpoint_path, global_step
    else:
        print('No checkpoint file found at [%s]' % checkpoint_path)
        exit(-1)

def _activation_summary(x):
    tensor_name = re.sub('%s_[0-9]*/' % TOWER_NAME, '', x.op.name)
    tf.summary.histogram(tensor_name + '/activations', x)
    tf.summary.scalar(tensor_name + '/sparsity', tf.nn.zero_fraction(x))

def inception_v3(nlabels, images, pkeep, is_training):
    batch_norm_params = {
        "is_training": is_training,
        "trainable": True,
        # Decay for the moving averages.
        "decay": 0.9997,
        # Epsilon to prevent 0s in variance.
        "epsilon": 0.001,
        # Collection containing the moving mean and moving variance.
        "variables_collections": {
            "beta": None,
            "gamma": None,
            "moving_mean": ["moving_vars"],
            "moving_variance": ["moving_vars"],
        }
    }
    weight_decay = 0.00004
    stddev = 0.1
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("InceptionV3", "InceptionV3", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [tf.contrib.slim.conv2d, tf.contrib.slim.fully_connected],
                weights_regularizer=weights_regularizer,
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [tf.contrib.slim.conv2d],
                    weights_initializer=tf.truncated_normal_initializer(stddev=stddev),
                    activation_fn=tf.nn.relu,
                    normalizer_fn=batch_norm,
                    normalizer_params=batch_norm_params):
                net, end_points = inception_v3_base(images, scope=scope)
                with tf.variable_scope("logits"):
                    shape = net.get_shape()
                    net = avg_pool2d(net, shape[1:3], padding="VALID", scope="pool")
                    net = tf.nn.dropout(net, pkeep, name='droplast')
                    net = flatten(net, scope="flatten")
    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.truncated_normal([2048, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(net, weights), biases, name=scope.name)
        _activation_summary(output)
    return output

def levi_hassner_bn(nlabels, images, pkeep, is_training):
    batch_norm_params = {
        "is_training": is_training,
        "trainable": True,
        # Decay for the moving averages.
        "decay": 0.9997,
        # Epsilon to prevent 0s in variance.
        "epsilon": 0.001,
        # Collection containing the moving mean and moving variance.
        "variables_collections": {
            "beta": None,
            "gamma": None,
            "moving_mean": ["moving_vars"],
            "moving_variance": ["moving_vars"],
        }
    }
    weight_decay = 0.0005
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("LeviHassnerBN", "LeviHassnerBN", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [convolution2d, fully_connected],
                weights_regularizer=weights_regularizer,
                biases_initializer=tf.constant_initializer(1.),
                weights_initializer=tf.random_normal_initializer(stddev=0.005),
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [convolution2d],
                    weights_initializer=tf.random_normal_initializer(stddev=0.01),
                    normalizer_fn=batch_norm,
                    normalizer_params=batch_norm_params):
                conv1 = convolution2d(images, 96, [7, 7], [4, 4], padding='VALID',
                                      biases_initializer=tf.constant_initializer(0.), scope='conv1')
                pool1 = max_pool2d(conv1, 3, 2, padding='VALID', scope='pool1')
                conv2 = convolution2d(pool1, 256, [5, 5], [1, 1], padding='SAME', scope='conv2')
                pool2 = max_pool2d(conv2, 3, 2, padding='VALID', scope='pool2')
                conv3 = convolution2d(pool2, 384, [3, 3], [1, 1], padding='SAME',
                                      biases_initializer=tf.constant_initializer(0.), scope='conv3')
                pool3 = max_pool2d(conv3, 3, 2, padding='VALID', scope='pool3')
                # can use tf.contrib.layer.flatten
                flat = tf.reshape(pool3, [-1, 384 * 6 * 6], name='reshape')
                full1 = fully_connected(flat, 512, scope='full1')
                drop1 = tf.nn.dropout(full1, pkeep, name='drop1')
                full2 = fully_connected(drop1, 512, scope='full2')
                drop2 = tf.nn.dropout(full2, pkeep, name='drop2')
    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.random_normal([512, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(drop2, weights), biases, name=scope.name)
    return output

def levi_hassner(nlabels, images, pkeep, is_training):
    weight_decay = 0.0005
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("LeviHassner", "LeviHassner", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [convolution2d, fully_connected],
                weights_regularizer=weights_regularizer,
                biases_initializer=tf.constant_initializer(1.),
                weights_initializer=tf.random_normal_initializer(stddev=0.005),
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [convolution2d],
                    weights_initializer=tf.random_normal_initializer(stddev=0.01)):
                conv1 = convolution2d(images, 96, [7, 7], [4, 4], padding='VALID',
                                      biases_initializer=tf.constant_initializer(0.), scope='conv1')
                pool1 = max_pool2d(conv1, 3, 2, padding='VALID', scope='pool1')
                norm1 = tf.nn.local_response_normalization(pool1, 5, alpha=0.0001, beta=0.75, name='norm1')
                conv2 = convolution2d(norm1, 256, [5, 5], [1, 1], padding='SAME', scope='conv2')
                pool2 = max_pool2d(conv2, 3, 2, padding='VALID', scope='pool2')
                norm2 = tf.nn.local_response_normalization(pool2, 5, alpha=0.0001, beta=0.75, name='norm2')
                conv3 = convolution2d(norm2, 384, [3, 3], [1, 1],
                                      biases_initializer=tf.constant_initializer(0.),
                                      padding='SAME', scope='conv3')
                pool3 = max_pool2d(conv3, 3, 2, padding='VALID', scope='pool3')
                flat = tf.reshape(pool3, [-1, 384 * 6 * 6], name='reshape')
                full1 = fully_connected(flat, 512, scope='full1')
                drop1 = tf.nn.dropout(full1, pkeep, name='drop1')
                full2 = fully_connected(drop1, 512, scope='full2')
                drop2 = tf.nn.dropout(full2, pkeep, name='drop2')
    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.random_normal([512, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(drop2, weights), biases, name=scope.name)
    return output
Training the model.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from six.moves import xrange
from datetime import datetime
import time
import os
import numpy as np
import tensorflow as tf
from data import distorted_inputs
from model import select_model
import json
import re

LAMBDA = 0.01
MOM = 0.9

tf.app.flags.DEFINE_string('pre_checkpoint_path', '',
                           """If specified, restore this pretrained model """
                           """before beginning any training.""")
tf.app.flags.DEFINE_string('train_dir', '/home/dpressel/dev/work/AgeGenderDeepLearning/Folds/tf/test_fold_is_0',
                           'Training directory')
tf.app.flags.DEFINE_boolean('log_device_placement', False,
                            """Whether to log device placement.""")
tf.app.flags.DEFINE_integer('num_preprocess_threads', 4,
                            'Number of preprocessing threads')
tf.app.flags.DEFINE_string('optim', 'Momentum', 'Optimizer')
tf.app.flags.DEFINE_integer('image_size', 227, 'Image size')
tf.app.flags.DEFINE_float('eta', 0.01, 'Learning rate')
tf.app.flags.DEFINE_float('pdrop', 0., 'Dropout probability')
tf.app.flags.DEFINE_integer('max_steps', 40000, 'Number of iterations')
tf.app.flags.DEFINE_integer('steps_per_decay', 10000, 'Number of steps before learning rate decay')
tf.app.flags.DEFINE_float('eta_decay_rate', 0.1, 'Learning rate decay')
tf.app.flags.DEFINE_integer('epochs', -1, 'Number of epochs')
tf.app.flags.DEFINE_integer('batch_size', 128, 'Batch size')
tf.app.flags.DEFINE_string('checkpoint', 'checkpoint', 'Checkpoint name')
tf.app.flags.DEFINE_string('model_type', 'default', 'Type of convnet')
tf.app.flags.DEFINE_string('pre_model', '',  # './inception_v3.ckpt',
                           'checkpoint file')

FLAGS = tf.app.flags.FLAGS

# Every 5k steps cut learning rate in half
def exponential_staircase_decay(at_step=10000, decay_rate=0.1):
    print('decay [%f] every [%d] steps' % (decay_rate, at_step))
    def _decay(lr, global_step):
        return tf.train.exponential_decay(lr, global_step, at_step, decay_rate, staircase=True)
    return _decay

def optimizer(optim, eta, loss_fn, at_step, decay_rate):
    global_step = tf.Variable(0, trainable=False)
    optz = optim
    if optim == 'Adadelta':
        optz = lambda lr: tf.train.AdadeltaOptimizer(lr, 0.95, 1e-6)
        lr_decay_fn = None
    elif optim == 'Momentum':
        optz = lambda lr: tf.train.MomentumOptimizer(lr, MOM)
        lr_decay_fn = exponential_staircase_decay(at_step, decay_rate)
    return tf.contrib.layers.optimize_loss(loss_fn, global_step, eta, optz, clip_gradients=4.,
                                           learning_rate_decay_fn=lr_decay_fn)

def loss(logits, labels):
    labels = tf.cast(labels, tf.int32)
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=labels, name='cross_entropy_per_example')
    cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
    tf.add_to_collection('losses', cross_entropy_mean)
    losses = tf.get_collection('losses')
    regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
    total_loss = cross_entropy_mean + LAMBDA * sum(regularization_losses)
    tf.summary.scalar('tl (raw)', total_loss)
    #total_loss = tf.add_n(losses + regularization_losses, name='total_loss')
    loss_averages = tf.train.ExponentialMovingAverage(0.9, name='avg')
    loss_averages_op = loss_averages.apply(losses + [total_loss])
    for l in losses + [total_loss]:
        tf.summary.scalar(l.op.name + ' (raw)', l)
        tf.summary.scalar(l.op.name, loss_averages.average(l))
    with tf.control_dependencies([loss_averages_op]):
        total_loss = tf.identity(total_loss)
    return total_loss

def main(argv=None):
    with tf.Graph().as_default():
        model_fn = select_model(FLAGS.model_type)
        # Open the metadata file md.json (generated during preprocessing)
        # and figure out nlabels and the size of an epoch
        input_file = os.path.join(FLAGS.train_dir, 'md.json')
        print(input_file)
        with open(input_file, 'r') as f:
            md = json.load(f)

        images, labels, _ = distorted_inputs(FLAGS.train_dir, FLAGS.batch_size,
                                             FLAGS.image_size, FLAGS.num_preprocess_threads)
        logits = model_fn(md['nlabels'], images, 1 - FLAGS.pdrop, True)
        total_loss = loss(logits, labels)

        train_op = optimizer(FLAGS.optim, FLAGS.eta, total_loss,
                             FLAGS.steps_per_decay, FLAGS.eta_decay_rate)
        saver = tf.train.Saver(tf.global_variables())
        summary_op = tf.summary.merge_all()
        sess = tf.Session(config=tf.ConfigProto(
            log_device_placement=FLAGS.log_device_placement))
        tf.global_variables_initializer().run(session=sess)

        # This is total hackland, it only works to fine-tune iv3
        # (optionally restore a pre-trained Inception V3 for fine-tuning)
        if FLAGS.pre_model:
            inception_variables = tf.get_collection(
                tf.GraphKeys.VARIABLES, scope="InceptionV3")
            restorer = tf.train.Saver(inception_variables)
            restorer.restore(sess, FLAGS.pre_model)

        if FLAGS.pre_checkpoint_path:
            if tf.gfile.Exists(FLAGS.pre_checkpoint_path) is True:
                print('Trying to restore checkpoint from %s' % FLAGS.pre_checkpoint_path)
                restorer = tf.train.Saver()
                tf.train.latest_checkpoint(FLAGS.pre_checkpoint_path)
                print('%s: Pre-trained model restored from %s' %
                      (datetime.now(), FLAGS.pre_checkpoint_path))

        # Store the ckpt files in a run-(pid) directory
        run_dir = '%s/run-%d' % (FLAGS.train_dir, os.getpid())
        checkpoint_path = '%s/%s' % (run_dir, FLAGS.checkpoint)
        if tf.gfile.Exists(run_dir) is False:
            print('Creating %s' % run_dir)
            tf.gfile.MakeDirs(run_dir)

        tf.train.write_graph(sess.graph_def, run_dir, 'model.pb', as_text=True)
        tf.train.start_queue_runners(sess=sess)
        summary_writer = tf.summary.FileWriter(run_dir, sess.graph)
        steps_per_train_epoch = int(md['train_counts'] / FLAGS.batch_size)
        num_steps = FLAGS.max_steps if FLAGS.epochs < 1 else FLAGS.epochs * steps_per_train_epoch
        print('Requested number of steps [%d]' % num_steps)

        for step in xrange(num_steps):
            start_time = time.time()
            _, loss_value = sess.run([train_op, total_loss])
            duration = time.time() - start_time

            assert not np.isnan(loss_value), 'Model diverged with loss = NaN'

            # Log progress every 10 steps
            if step % 10 == 0:
                num_examples_per_step = FLAGS.batch_size
                examples_per_sec = num_examples_per_step / duration
                sec_per_batch = float(duration)
                format_str = ('%s: step %d, loss = %.3f (%.1f examples/sec; %.3f '
                              'sec/batch)')
                print(format_str % (datetime.now(), step, loss_value,
                                    examples_per_sec, sec_per_batch))

            # Write a summary every 100 steps
            if step % 100 == 0:
                summary_str = sess.run(summary_op)
                summary_writer.add_summary(summary_str, step)

            # Save a checkpoint every 1000 steps and at the final step
            if step % 1000 == 0 or (step + 1) == num_steps:
                saver.save(sess, checkpoint_path, global_step=step)

if __name__ == '__main__':
    tf.app.run()
Validating the model.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import math
import time
from data import inputs
import numpy as np
import tensorflow as tf
from model import select_model, get_checkpoint
from utils import *
import os
import json
import csv

RESIZE_FINAL = 227
GENDER_LIST = ['M', 'F']
AGE_LIST = ['(0, 2)', '(4, 6)', '(8, 12)', '(15, 20)', '(25, 32)', '(38, 43)', '(48, 53)', '(60, 100)']
MAX_BATCH_SZ = 128

tf.app.flags.DEFINE_string('model_dir', '', 'Model directory (where training data lives)')
tf.app.flags.DEFINE_string('class_type', 'age', 'Classification type (age|gender)')
tf.app.flags.DEFINE_string('device_id', '/cpu:0', 'What processing unit to execute inference on')
tf.app.flags.DEFINE_string('filename', '', 'File (Image) or File list (Text/No header TSV) to process')
tf.app.flags.DEFINE_string('target', '', 'CSV file containing the filename processed along with best guess and score')
tf.app.flags.DEFINE_string('checkpoint', 'checkpoint', 'Checkpoint basename')
tf.app.flags.DEFINE_string('model_type', 'default', 'Type of convnet')
tf.app.flags.DEFINE_string('requested_step', '', 'Within the model directory, a requested step to restore e.g., 9000')
tf.app.flags.DEFINE_boolean('single_look', False, 'single look at the image or multiple crops')
tf.app.flags.DEFINE_string('face_detection_model', '', 'Do frontal face detection with model specified')
tf.app.flags.DEFINE_string('face_detection_type', 'cascade', 'Face detection model type (yolo_tiny|cascade)')

FLAGS = tf.app.flags.FLAGS

def one_of(fname, types):
    return any([fname.endswith('.' + ty) for ty in types])

def resolve_file(fname):
    if os.path.exists(fname):
        return fname
    for suffix in ('.jpg', '.png', '.JPG', '.PNG', '.jpeg'):
        cand = fname + suffix
        if os.path.exists(cand):
            return cand
    return None

def classify_many_single_crop(sess, label_list, softmax_output, coder, images, image_files, writer):
    try:
        num_batches = math.ceil(len(image_files) / MAX_BATCH_SZ)
        pg = ProgressBar(num_batches)
        for j in range(num_batches):
            start_offset = j * MAX_BATCH_SZ
            end_offset = min((j + 1) * MAX_BATCH_SZ, len(image_files))
            batch_image_files = image_files[start_offset:end_offset]
            print(start_offset, end_offset, len(batch_image_files))
            image_batch = make_multi_image_batch(batch_image_files, coder)
            batch_results = sess.run(softmax_output, feed_dict={images: image_batch.eval()})
            batch_sz = batch_results.shape[0]
            for i in range(batch_sz):
                output_i = batch_results[i]
                best_i = np.argmax(output_i)
                best_choice = (label_list[best_i], output_i[best_i])
                print('Guess @ 1 %s, prob = %.2f' % best_choice)
                if writer is not None:
                    f = batch_image_files[i]
                    writer.writerow((f, best_choice[0], '%.2f' % best_choice[1]))
            pg.update()
        pg.done()
    except Exception as e:
        print(e)
        print('Failed to run all images')

def classify_one_multi_crop(sess, label_list, softmax_output, coder, images, image_file, writer):
    try:
        print('Running file %s' % image_file)
        image_batch = make_multi_crop_batch(image_file, coder)
        batch_results = sess.run(softmax_output, feed_dict={images: image_batch.eval()})
        output = batch_results[0]
        batch_sz = batch_results.shape[0]
        # Average the softmax outputs over all crops
        for i in range(1, batch_sz):
            output = output + batch_results[i]
        output /= batch_sz
        best = np.argmax(output)  # most likely class
        best_choice = (label_list[best], output[best])
        print('Guess @ 1 %s, prob = %.2f' % best_choice)

        nlabels = len(label_list)
        if nlabels > 2:
            output[best] = 0
            second_best = np.argmax(output)
            print('Guess @ 2 %s, prob = %.2f' % (label_list[second_best], output[second_best]))

        if writer is not None:
            writer.writerow((image_file, best_choice[0], '%.2f' % best_choice[1]))
    except Exception as e:
        print(e)
        print('Failed to run image %s ' % image_file)

def list_images(srcfile):
    with open(srcfile, 'r') as csvfile:
        delim = ',' if srcfile.endswith('.csv') else '\t'
        reader = csv.reader(csvfile, delimiter=delim)
        if srcfile.endswith('.csv') or srcfile.endswith('.tsv'):
            print('skipping header')
            _ = next(reader)
        return [row[0] for row in reader]

def main(argv=None):  # pylint: disable=unused-argument
    files = []

    if FLAGS.face_detection_model:
        print('Using face detector (%s) %s' % (FLAGS.face_detection_type, FLAGS.face_detection_model))
        face_detect = face_detection_model(FLAGS.face_detection_type, FLAGS.face_detection_model)
        face_files, rectangles = face_detect.run(FLAGS.filename)
        print(face_files)
        files += face_files

    config = tf.ConfigProto(allow_soft_placement=True)
    with tf.Session(config=config) as sess:
        label_list = AGE_LIST if FLAGS.class_type == 'age' else GENDER_LIST
        nlabels = len(label_list)
        print('Executing on %s' % FLAGS.device_id)
        model_fn = select_model(FLAGS.model_type)

        with tf.device(FLAGS.device_id):
            images = tf.placeholder(tf.float32, [None, RESIZE_FINAL, RESIZE_FINAL, 3])
            logits = model_fn(nlabels, images, 1, False)
            init = tf.global_variables_initializer()

            requested_step = FLAGS.requested_step if FLAGS.requested_step else None
            checkpoint_path = '%s' % (FLAGS.model_dir)
            model_checkpoint_path, global_step = get_checkpoint(checkpoint_path, requested_step, FLAGS.checkpoint)
            saver = tf.train.Saver()
            saver.restore(sess, model_checkpoint_path)

            softmax_output = tf.nn.softmax(logits)
            coder = ImageCoder()

            # Support a batch mode if no face detection model
            if len(files) == 0:
                if os.path.isdir(FLAGS.filename):
                    for relpath in os.listdir(FLAGS.filename):
                        abspath = os.path.join(FLAGS.filename, relpath)
                        if os.path.isfile(abspath) and any([abspath.endswith('.' + ty) for ty in ('jpg', 'png', 'JPG', 'PNG', 'jpeg')]):
                            print(abspath)
                            files.append(abspath)
                else:
                    files.append(FLAGS.filename)
                    # If it happens to be a list file, read the list and clobber the files
                    if any([FLAGS.filename.endswith('.' + ty) for ty in ('csv', 'tsv', 'txt')]):
                        files = list_images(FLAGS.filename)

            writer = None
            output = None
            if FLAGS.target:
                print('Creating output file %s' % FLAGS.target)
                output = open(FLAGS.target, 'w')
                writer = csv.writer(output)
                writer.writerow(('file', 'label', 'score'))

            image_files = list(filter(lambda x: x is not None, [resolve_file(f) for f in files]))
            print(image_files)
            if FLAGS.single_look:
                classify_many_single_crop(sess, label_list, softmax_output, coder, images, image_files, writer)
            else:
                for image_file in image_files:
                    classify_one_multi_crop(sess, label_list, softmax_output, coder, images, image_file, writer)

            if output is not None:
                output.close()

if __name__ == '__main__':
    tf.app.run()
Microsoft's face-photo demo site also recognizes gender and age from an image and can search for images by query.
References: 《TensorFlow技术解析与实战》 (TensorFlow: Analysis and Practice)
Referrals for machine learning jobs in Shanghai are welcome; my WeChat: qingxingfengzi