We present a framework which enables the combination of different mobile devices into one multi-display such that visual content can be shown on a larger area consisting, e.g., of several mobile phones placed arbitrarily on the table. Our system allows the user to perform multi-touch interaction metaphors, even across different devices, and it guarantees the proper synchronization of the individual displays with low latency. Hence from the user’s perspective the heterogeneous collection of mobile devices acts like one single display and input device. From the system perspective the major technical and algorithmic challenges lie in the co-calibration of the individual displays and in the low latency synchronization and communication of user events. For the calibration we estimate the relative positioning of the displays by visual object recognition and an optional manual calibration step.
We present a semi-interactive system for advanced video processing and editing. The basic idea is to partially recover planar regions in object space and to exploit this minimal pseudo-3D information in order to make perspectively correct modifications. Typical operations are to increase the quality of a low-resolution video by overlaying high-resolution photos of the same approximately planar object or to add or remove objects by copying them from other video streams and distorting them perspectively according to some planar reference geometry. The necessary user interaction is entirely in 2D and easy to perform even for untrained users. The key to our video processing functionality is a very robust and mostly automatic algorithm for the perspective registration of video frames and photos, which can be used as a very effective video stabilization tool even in the presence of fast and blurred motion. Explicit 3D reconstruction is thus avoided and replaced by image and video rectification. The technique is based on state-of-the-art feature tracking and homography matching. In complicated and ambiguous scenes, user interaction as simple as 2D brush strokes can be used to support the registration. In the stabilized video, he reference plane appears frozen which simplifies segmentation and matte extraction. We demonstrate our system for a number of quite challenging application scenarios such as video enhancement, background replacement, foreground removal and perspectively correct video cut and paste.
We present an interactive system for fragment-based image completion which exploits information about the approximate 3D structure in a scene in order to estimate and apply perspective corrections when copying a source fragment to a target position. Even though implicit 3D information is used, the interaction is strictly 2D which makes the user interface very simple and intuitive. We propose different interaction metaphors in our system for providing 3D information interactively. Our search and matching procedure is done in the Fourier domain and hence it is very fast and it allows us to use large fragments and multiple source images with high resolution while still obtaining interactive response times. Our image completion technique also takes user-specified structure information into account where we generalize the concept of feature curves to arbitrary sets of feature pixels. We demonstrate our technique on a number of difficult completion tasks.