We have developed a new framework for computing optic-flow robustly using an estimation-theoretic framework [40]. While this work does not specifically use these ideas, we have future plans to try to adapt this algorithm to such a framework.

Our method begins with an implementation of the Horn-Schunck method of computing optic-flow [22]. The underlying assumption of this method is the optic-flow constraint equation, which assumes that image irradiance at time t and at time t + δt will be the same:

$$I(x + \delta x,\; y + \delta y,\; t + \delta t) = I(x, y, t).$$

If we expand this constraint via a Taylor series expansion and drop second and higher-order terms, we obtain the form of the constraint we need to compute normal velocity:

$$I_x u + I_y v + I_t = 0,$$

where u and v are the velocities in image space, and Ix, Iy, and It are the spatial and temporal derivatives in the image. This constraint limits the velocity field in an image to lie on a straight line in velocity space. The actual velocity cannot be determined directly from this constraint due to the aperture problem, but one can recover the component of velocity normal to this constraint line. A second, iterative process is usually employed to propagate velocities in image neighborhoods, based upon a variety of smoothness and heuristic constraints. These added neighborhood constraints allow for recovery of the actual velocities u, v in the image.

While computationally appealing, this method of determining optic-flow has some inherent problems. First, the computation is done on a pixel-by-pixel basis, creating a large computational demand. Second, the information on optic-flow is only available in areas where the gradients defined above exist.

We have overcome the first of these problems by using the PIPE image processor [26], [7]. The PIPE is a pipelined parallel image-processing computer capable of processing 256 x 256 x 8 bit images at frame-rate speeds, and it supports the operations necessary for optic-flow computation in a pixel-parallel method (a typical image operation such as convolution, warping, or addition/subtraction of images can be done in one cycle of 1/60 s). The second problem is alleviated by our not needing to know the actual velocities in the image. What we need is the ability to locate and quantify gross image motion robustly. This rules out simple differencing methods, which are too prone to noise and make locating image movement difficult. Hence, a set of normal velocities at strong gradients is adequate for our task, precluding the need to iteratively propagate velocities in the image.

A. Computing Normal Optic-Flow in Real-Time

Our goal is to track a single moving object in real time. We are using two fixed cameras that image the scene and need to report motion in 3D to a robotic arm control program. Each camera is calibrated with the 3D scene, but there is no explicit need to use registered (i.e., scan-line coherent) cameras. Our method computes the normal component of optic-flow for each pixel in each camera image, finds a centroid of motion energy for each image, and then uses triangulation to intersect the back-projected centroids of image motion in each camera. Four processors are used in parallel on the PIPE, assigned two per camera, one each for the calculation of the X and Y motion-energy centroids in each image. We also use a special processor board (ISMAP) to perform real-time histogramming.
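To make the per-image computation concrete, the following C++ fragment is a minimal sketch of the idea described above: the normal-velocity magnitude |v_n| = |I_t| / |∇I| is evaluated only where the spatial gradient is strong, and those values are accumulated into a motion-energy centroid. The function name, the central-difference derivatives, and the gradient threshold are illustrative assumptions of ours; the actual system performs the equivalent operations as pipelined frame-rate image operations on the PIPE.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Centroid { double x, y, energy; };

// I0, I1, I2 are three consecutive Gaussian-smoothed frames (row-major,
// width*height floats).  Normal flow is evaluated only where the spatial
// gradient of the middle frame is strong enough for the constraint to be
// reliable; its magnitude then weights a centroid of motion energy.
Centroid motionEnergyCentroid(const std::vector<float>& I0,
                              const std::vector<float>& I1,
                              const std::vector<float>& I2,
                              std::size_t width, std::size_t height,
                              float gradThreshold = 10.0f)  // assumed value
{
    Centroid c{0.0, 0.0, 0.0};
    for (std::size_t y = 1; y + 1 < height; ++y) {
        for (std::size_t x = 1; x + 1 < width; ++x) {
            const std::size_t i = y * width + x;
            // Discrete spatial derivatives of the middle frame (central differences).
            const float Ix = 0.5f * (I1[i + 1] - I1[i - 1]);
            const float Iy = 0.5f * (I1[i + width] - I1[i - width]);
            // Temporal derivative from the oldest and newest frames.
            const float It = 0.5f * (I2[i] - I0[i]);
            const float gradMag = std::sqrt(Ix * Ix + Iy * Iy);
            if (gradMag < gradThreshold) continue;  // constraint unreliable here
            // Component of velocity normal to the constraint line: |v_n| = |I_t| / |grad I|.
            const float vn = std::fabs(It) / gradMag;
            // Treat |v_n| as motion energy and accumulate a weighted centroid.
            c.x += vn * static_cast<double>(x);
            c.y += vn * static_cast<double>(y);
            c.energy += vn;
        }
    }
    if (c.energy > 0.0) { c.x /= c.energy; c.y /= c.energy; }
    return c;
}
```

The threshold on the gradient magnitude is what lets us skip the iterative velocity-propagation stage: only pixels where the normal component is well defined contribute to the centroid.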
The steps below correspond to the numbers in Fig. 3.

1) The camera images the scene and the image is sent to processing stages in the PIPE.

2) The image is smoothed by convolution with a Gaussian mask. The convolution operator is a built-in operation in the PIPE and can be performed in one frame cycle.

3)-4) In the next two cycles, two more images are read in, smoothed, and buffered, yielding the smoothed images I0, I1, and I2. The ability to buffer and pipeline images allows temporal operations on images, albeit at the cost of processing delays (lags) on output. There are now three smoothed images in the PIPE, with the oldest image lagging by 3/60 s.

5) Images I0 and I2 are subtracted, yielding the temporal derivative It.

6) In parallel with step 5, image I1 is convolved with spatial gradient operators, yielding the spatial derivatives Ix and Iy.

The ability to track an object in three dimensions implies that there will be motion across the retinas (image planes) that are imaging the scene. By identifying this motion in each camera, we can begin to find the actual 3D motion. Our system couples this real-time vision computation with a separate arm control system computer that performs inverse kinematic transformations and joint-level servoing. Each of these systems has its own sampling rate, noise characteristics, and processing delays, which need to be integrated to achieve smooth and stable real-time performance. In our case, this involves overcoming visual processing noise and delays with a predictive filter based upon a probabilistic analysis of the system noise characteristics. In addition, real-time arm control needs to be able to operate at fast servo rates regardless of whether new predictions of object position are available.

The system consists of two fixed cameras that can image a scene containing a moving object (Fig. 1). A PUMA 560 with a parallel-jaw gripper attached is used to track and pick up the object as it moves (Fig. 2). The system operates as follows:

1) The imaging system performs a stereoscopic optic-flow calculation at each pixel in the image. From these optic-flow fields, a motion-energy profile is obtained that forms the basis for a triangulation that can recover the 3D position of a moving object at video rates.

2) The 3D position of the moving object computed by step 1 is initially smoothed to remove sensor noise, and a nonlinear filter is used to recover the correct trajectory parameters, which can be used for forward prediction; the updated position is sent to the trajectory-planner/arm-control system.

3) The trajectory planner updates the joint-level servos of the arm via kinematic transform equations. An additional fixed-gain filter is used to provide servo-level control in case of missed or delayed communication.
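The role of a fixed-gain filter at the servo level can be illustrated with a simple position-velocity predictor. The sketch below is a generic fixed-gain (alpha-beta) filter with assumed gains and sample period; it is not the paper's nonlinear trajectory filter, and the structure and gains actually used at the servo level are not reproduced here.

```cpp
// Generic fixed-gain (alpha-beta) predictor for one Cartesian coordinate.
// Gains and sample period are assumed, illustrative values only.
struct AlphaBetaFilter {
    double alpha = 0.85;        // position gain (assumed)
    double beta  = 0.3;         // velocity gain (assumed)
    double dt    = 1.0 / 60.0;  // sample period in seconds (assumed video rate)
    double pos   = 0.0;         // current position estimate
    double vel   = 0.0;         // current velocity estimate

    // Advance the estimate by one sample period (run every servo cycle).
    void predict() { pos += vel * dt; }

    // Correct the estimate when a new (noisy) position measurement arrives.
    void update(double measured) {
        const double residual = measured - pos;
        pos += alpha * residual;
        vel += (beta / dt) * residual;
    }

    // Forward-predict the position 'lag' seconds ahead, e.g. to compensate
    // for vision-pipeline delay when no fresh measurement is available.
    double predictAhead(double lag) const { return pos + vel * lag; }
};
```

In use, the arm controller would call predict() every servo cycle and update() only when a fresh 3D position arrives from the vision and filtering stages, so joint-level control keeps running at full rate even when measurements are missed or delayed.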