Facial Performance Capture with Deep Neural Networks

NVIDIA Technical Report NVR-20xx-004, September 20xx. © 20xx NVIDIA Corporation. All rights reserved.

Abstract

We present a deep learning technique for facial performance capture, i.e., the transfer of video footage into a motion sequence of a 3D mesh representing an actor's face. Specifically, we build on a conventional capture pipeline based on computer vision and multi-view video, and use its results to train a deep neural network to produce similar output from a monocular video sequence. Once trained, our network produces high-quality results for unseen inputs with greatly reduced effort compared to the conventional system. In practice, we have found that approximately 10 minutes' worth of high-quality data is sufficient for training a network that can then automatically process as much footage from video to 3D as needed. This yields major savings in the development of modern narrative-driven video games involving digital doubles of actors and potentially hours of animated dialogue per character.

1 Introduction

Using digital doubles of human actors is a key component in modern video games' pursuit of realism. Transferring the essence of a character into the digital domain poses many challenging technical problems, but the accurate capture of facial movement remains especially tricky. Due to humans' innate sensitivity to the slightest facial cues, it is difficult to surpass the uncanny valley, where an otherwise believable rendering of a character appears lifeless or otherwise unnatural.

Various tools are available for building facial capture pipelines that turn video footage into 3D in one form or another, but their accuracy leaves room for improvement. In practice, high-quality results are generally achievable only with a significant amount of manual polishing of the output data. This can be a major cost in a large video game production. Furthermore, the animators doing the fixing need to be particularly skilled, or otherwise the editing may introduce distracting, unnatural motion.

In this paper, we introduce a neural network based solution to facial performance capture. Our goal is not to remove the need for manual work entirely, but to dramatically reduce the extent to which it is required. In our approach, a conventional capture pipeline needs to be applied only to a small subset of the input footage, in order to generate enough data for training a neural network. The bulk of the footage can then be processed using the trained network, skipping the conventional labor-intensive capture pipeline entirely. Our approach is outlined in Figure 1.

[Figure 1: (a) Training: training footage is processed by the conventional capture pipeline to produce target vertex positions; the neural network under training produces predicted vertex positions, and a loss function comparing the two provides gradients. (b) Inference: the bulk of the footage is fed to the trained neural network, which outputs inferred vertex positions.]

Figure 1: The goal of our system is to reduce the amount of footage that needs to be processed using the conventional, labor-intensive capture pipeline. (a) In order to train a neural network, all training footage must be processed using the conventional capture pipeline. This provides the input/target pairs that the network needs in order to learn the mapping from video footage to vertex positions. According to our experiments, approximately 10 minutes' worth of training material is sufficient per actor. (b) When processing the bulk of the material, the conventional capture pipeline can be skipped. Depending on the amount of material, this can yield significant cost savings in production.
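Conceptually, the training stage of Figure 1(a) is standard supervised regression from a video frame to a vector of vertex positions. The following minimal PyTorch sketch illustrates that idea; the architecture, layer sizes, optimizer, and loss shown here are our own illustrative assumptions, not the configuration described in this report.

```python
# Minimal sketch of the supervised training setup in Figure 1(a).
# Architecture, sizes, optimizer, and loss are illustrative assumptions,
# not the configuration described in the report.
import torch
import torch.nn as nn

NUM_VERTICES = 5000  # ~5000 animated control vertices (see the problem statement)

class FaceCaptureNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),   # grayscale input
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.head = nn.Linear(128 * 4 * 4, NUM_VERTICES * 3)  # x, y, z per vertex

    def forward(self, frames):  # frames: (batch, 1, height, width)
        return self.head(self.features(frames).flatten(1)).view(-1, NUM_VERTICES, 3)

def train_step(net, optimizer, frames, target_vertices):
    """One optimization step; targets come from the conventional pipeline."""
    optimizer.zero_grad()
    predicted = net(frames)                                    # predicted positions
    loss = nn.functional.mse_loss(predicted, target_vertices)  # compare to targets
    loss.backward()                                            # gradients, as in Fig. 1(a)
    optimizer.step()
    return loss.item()

net = FaceCaptureNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-4)
frames = torch.zeros(2, 1, 240, 320)        # stand-in cropped grayscale frames
targets = torch.zeros(2, NUM_VERTICES, 3)   # stand-in conventional-pipeline output
train_step(net, opt, frames, targets)
```

Once such a network is trained on the input/target pairs, inference as in Figure 1(b) is simply a forward pass per frame, with no conventional pipeline involved.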
Problem statement

We assume that the input for the capture pipeline is one or more video streams of the actor's head captured under controlled conditions. The positions of the cameras remain fixed, the lighting and background are standardized, and the actor is to remain at approximately the same position relative to the cameras throughout all shots. Naturally, some amount of movement needs to be allowed, and we achieve this through input data augmentation in the training phase (Section ).

In our case, the outputs of the capture pipeline are the per-frame positions of the control vertices of a facial mesh, as illustrated in Figure 2. There are various other ways to encode the facial expression, including rig parameters or blend shape weights. In the system in which our work was developed, such encodings are introduced in later stages, mainly for compression and rendering purposes, but the primary capture output consists of the positions of approximately 5000 animated vertices on a fixed-topology facial mesh.

Existing capture pipeline at Remedy

The target data necessary for training the neural network was generated using Remedy Entertainment's existing capture pipeline, based on the commercial DI4D PRO system [Dimensional Imaging 20xx], which employs nine video cameras. The benefit of this system is that it captures the nuanced interactions of the skull, muscles, fascia and skin of an actor, so as to bypass complex and expensive facial rigging and tissue simulations for digital doubles. First, an unstructured mesh with texture and optical flow data is created from the images for each frame of a facial performance. A fixed-topology template

[Figure 2: input video frame (left); output mesh (right).]

Figure 2: The input for the conventional capture pipeline is a set of nine images, whereas our network uses only a cropped portion of the center camera image converted to grayscale. The output of both the conventional capture pipeline and our network consists of the 3D positions of ~5000 animated control vertices for each frame.
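As a concrete illustration of the input described in Figure 2 and of the movement tolerance achieved through input data augmentation (see the problem statement above), the following sketch shows one plausible way to crop and grayscale the center camera frame and to jitter it during training. The crop box, jitter ranges, and luma weights are assumptions made for illustration; the report does not specify these values here.

```python
# Illustrative preprocessing and training-time augmentation: only a cropped,
# grayscale version of the center camera frame is fed to the network, and
# small random transforms simulate the limited head movement allowed during
# capture. Crop box, jitter ranges, and luma weights are assumptions.
import numpy as np

def preprocess(frame_rgb, crop_box=(100, 50, 612, 562)):
    """Crop the face region from the center camera image and convert to grayscale."""
    x0, y0, x1, y1 = crop_box
    crop = frame_rgb[y0:y1, x0:x1].astype(np.float32)
    gray = crop @ np.array([0.299, 0.587, 0.114], dtype=np.float32)  # luma weights
    return gray / 255.0

def augment(gray, rng, max_shift=8, brightness=0.1):
    """Random shift and brightness jitter, applied during training only."""
    dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
    shifted = np.roll(np.roll(gray, dy, axis=0), dx, axis=1)
    return np.clip(shifted * (1.0 + rng.uniform(-brightness, brightness)), 0.0, 1.0)

rng = np.random.default_rng(0)
frame = np.zeros((720, 1280, 3), dtype=np.uint8)  # stand-in for a video frame
sample = augment(preprocess(frame), rng)          # one augmented training input
```

At inference time only the deterministic preprocessing step would be applied, so that the network sees inputs consistent with its training distribution.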