pyramid-based vision system, but few results are reported with this system. Rao and Durrant-Whyte [36] have implemented a Kalman filter-based decentralized tracking system that tracks moving objects with multiple cameras. Miller [31] has integrated a camera and arm for a tracking task where the emphasis is on learning kinematic and control parameters of the system. Weiss et al. [42] also use visual feedback to develop control laws for manipulation. Brown [8] has implemented a gaze control system that links a robotic "head" containing binocular cameras with a servo controller that allows one to maintain a fixed gaze on a moving object. Clark and Ferrier [12] also have implemented a gaze control system for a mobile robot. A variation of the tracking problem is the case of moving cameras; some of the papers addressing this interesting problem are [9], [15], [44], and [18].

The majority of the literature on the control problems encountered in motion-tracking experiments is concerned with generating smooth, up-to-date trajectories from the noisy and delayed outputs of different vision algorithms. Our previous work [4] coped with that problem in a way similar to that of [38], using an α-β-γ filter, which is a form of steady-state Kalman filter. Other approaches can be found in [33], [34], [28], and [6]. In the work of Papanikolopoulos et al. [33], [34], visual sensors are used in the feedback loop to perform adaptive robotic visual tracking. Sophisticated control schemes are described which combine a Kalman filter's estimation and filtering power with an optimal (LQG) controller which computes the robot's motion. The vision system uses an optic-flow computation based on the SSD (sum of squared differences) method which, while time consuming, appears to be accurate enough for the tracking task. Efficient use of windows in the image can improve the performance of this method. The authors have presented good tracking results, and state that the controller is robust enough that the use of more complex (time-varying LQG) methods is not justified. Experimental results with the CMU Direct Drive Arm II show that the methods are quite accurate, robust, and promising.

The work of Lee and Kay [28] addresses the problem of uncertainty in the cameras' positions in the robot's coordinate frame. The requirement that the cameras be rigidly fixed in the robot's frame can be quite restrictive, since each time they are (most often accidentally) displaced, the system must be recalibrated.

We have developed a new framework for computing optic-flow robustly using an estimation-theoretic approach [40]. While the work described here does not specifically use those ideas, we plan to adapt this algorithm to such a framework in the future.

Our method begins with an implementation of the Horn-Schunck method of computing optic-flow [22]. The underlying assumption of this method is the optic-flow constraint equation, which assumes that image irradiance at time t and at time t + δt will be the same:

    I(x + δx, y + δy, t + δt) = I(x, y, t).

If we expand this constraint via a Taylor series expansion and drop second- and higher-order terms, we obtain the form of the constraint we need to compute normal velocity:

    I_x u + I_y v + I_t = 0,

where u and v are the velocities in image space, and I_x, I_y, and I_t are the spatial and temporal derivatives in the image. This constraint limits the velocity field in an image to lie on a straight line in velocity space.
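As a concrete illustration of this constraint, the sketch below computes I_x, I_y, and I_t with simple finite differences over two consecutive frames and evaluates the normal component of velocity only where the spatial gradient is strong. This is a minimal NumPy rendering of the idea, not the PIPE implementation described below; the derivative operators and the gradient threshold are illustrative assumptions.

    import numpy as np

    def normal_flow(frame0, frame1, grad_thresh=10.0):
        """Normal component of optic-flow from the constraint
        I_x*u + I_y*v + I_t = 0. Returns the signed speed along the
        spatial gradient direction, zeroed where the gradient is too
        weak to trust."""
        f0 = frame0.astype(np.float64)
        f1 = frame1.astype(np.float64)

        # Spatial derivatives via central differences; np.gradient
        # returns (d/d_row, d/d_col), i.e., (I_y, I_x).
        Iy, Ix = np.gradient(f0)
        # Temporal derivative from the frame pair.
        It = f1 - f0

        mag = np.sqrt(Ix**2 + Iy**2)
        strong = mag > grad_thresh  # keep only strong gradients

        # Normal velocity magnitude: u_n = -I_t / |grad I|.
        un = np.zeros_like(f0)
        un[strong] = -It[strong] / mag[strong]
        return un, strong

The division by the gradient magnitude gives the signed speed along the gradient direction, which is exactly the normal component discussed next.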
The actual velocity cannot be determined directly from this constraint due to the aperture problem, but one can recover the component of velocity normal to this constraint line. A second, iterative process is usually employed to propagate velocities in image neighborhoods, based upon a variety of smoothness and heuristic constraints. These added neighborhood constraints allow for recovery of the actual velocities u, v in the image. While computationally appealing, this method of determining optic-flow has some inherent problems. First, the computation is done on a pixel-by-pixel basis, creating a large computational demand. Second, the information on optic-flow is only available in areas where the gradients defined above exist.

We have overcome the first of these problems by using the PIPE image processor [26], [7]. The PIPE is a pipelined parallel image processing computer capable of processing 256 x 256 x 8 bit images at frame-rate speeds, and it supports the operations necessary for optic-flow computation in a pixel-parallel method (a typical image operation such as convolution, warping, or addition/subtraction of images can be done in one cycle, 1/60 s). The second problem is alleviated by our not needing to know the actual velocities in the image. What we need is the ability to locate and quantify gross image motion robustly. This rules out simple differencing methods, which are too prone to noise and make localization of image movement difficult. Hence, a set of normal velocities at strong gradients is adequate for our task, precluding the need to iteratively propagate velocities in the image.

A. Computing Normal Optic-Flow in Real-Time

Our goal is to track a single moving object in real time. We are using two fixed cameras that image the scene and need to report motion in 3-D to a robotic arm control program. Each camera is calibrated with the 3-D scene, but there is no explicit need to use registered (i.e., scanline-coherent) cameras. Our method computes the normal component of optic-flow for each pixel in each camera image, finds a centroid of motion energy for each image, and then uses triangulation to intersect the back-projected centroids of image motion from each camera (a sketch of this centroid-and-triangulation computation appears after the processing steps below). Four processors are used in parallel on the PIPE, two per camera: one each for the calculation of the X and Y motion energy centroids in each image. We also use a special processor board (ISMAP) to perform real-time histogramming. The steps below correspond to the numbers in Fig. 3.

1) The camera images the scene and the image is sent to processing stages in the PIPE.

2) The image is smoothed by convolution with a Gaussian mask. The
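To make the centroid-and-triangulation step referenced above concrete, the sketch below computes a motion-energy centroid per image and intersects the two back-projected rays in a least-squares sense. It assumes a standard pinhole model with intrinsics K and pose (R, t) for each calibrated camera; these names, and the use of a least-squares midpoint rather than any particular PIPE-based implementation, are assumptions for illustration.

    import numpy as np

    def motion_centroid(energy):
        """Centroid (x, y) of a motion-energy image, or None if no motion."""
        total = energy.sum()
        if total < 1e-9:
            return None
        rows, cols = np.indices(energy.shape)
        return (cols * energy).sum() / total, (rows * energy).sum() / total

    def backproject(K, R, t, pixel):
        """World-frame ray (origin, unit direction) through a pixel for a
        camera with intrinsics K and extrinsics [R | t] (world -> camera)."""
        ray_cam = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
        origin = -R.T @ t              # camera center in the world frame
        direction = R.T @ ray_cam      # rotate the ray into the world frame
        return origin, direction / np.linalg.norm(direction)

    def intersect_rays(o1, d1, o2, d2):
        """Midpoint of the shortest segment between two 3-D rays -- a
        least-squares 'intersection', since the rays rarely meet exactly."""
        # Solve [d1, -d2] [s, t]^T ~= (o2 - o1) in the least-squares sense.
        A = np.stack([d1, -d2], axis=1)
        s, t_ = np.linalg.lstsq(A, o2 - o1, rcond=None)[0]
        return 0.5 * ((o1 + s * d1) + (o2 + t_ * d2))

In this sketch, the energy image could simply be the magnitude of the normal flow computed earlier, accumulated per camera at frame rate; the midpoint returned by intersect_rays then serves as the 3-D motion estimate reported to the arm control program.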