PhD Research


Motivation for research in this area may be provided on two levels. Early investigations into vision attempted to break down biological systems into their constituent parts and hence build a model for the process as a whole. This approach failed because of the complexity of the systems and the lack of understanding of the information processing tasks being performed: it endeavoured to understand a particular implementation (the human visual system) without understanding the underlying principles involved. The insight offered by Marr and his contemporaries was the suggestion that: "One cannot understand what seeing is and how it works unless one understands the underlying information processing tasks being solved." [Marr 1982]. Thus, the first motivation to study computer vision is to gain an insight into the fundamentals of information processing in vision. This in turn provides insights into questions about the human visual system and the portions of the brain that perform the corresponding interpretations.

Computational Theory: What is the goal of the computation, why is it appropriate and what is the logic of the strategy by which it can be carried out?
Representation and Algorithm: How can this computational theory be implemented? In particular, what is the representation for the input and the output and what is the algorithm for the transformation?
Hardware Implementation: How can the representation and algorithm be realised physically?

Table 1: The three levels at which any machine carrying out an information processing task must be understood. Taken from Figure 1-4, [Marr 1982]

Marr proposed three distinct levels, given in Table 1, at which an information processing task must be understood. We may therefore conclude that the study of the computational theory, representation and algorithms involved in vision gives us an insight into its challenges and allows us to obtain a deeper understanding, whether in the context of human or machine vision.


Besides intellectual curiosity, from the viewpoint of an engineer there is a strong motivation for understanding vision in order to create technologies which may be useful in their own right. Currently there is a great demand for 3D models of objects in the world; in particular, we are noticing the appearance of new 3D display technologies which will create the visualisations and interfaces of the future and change the way we are able to access and interpret information held on computers. Computer vision offers the possibility of providing this 3D information for many applications that in turn provide their own motivation. The most significant current and future applications of 3D model acquisition include:

  • Digital archiving and archaeology.
    The ability to generate 3D models from photos alone is highly attractive to museums and other archival institutions. Photographs are routinely taken for documentation in museums, so for very little extra expense we may obtain high-resolution models. These models allow interested parties to view pieces from any angle, in an interactive setting, from their own computers, and allow museums to provide displays for all the pieces in their collections, not just the ones they have space to display. This is particularly important for items which must be preserved in a restrictive environment that makes physical display or viewing difficult. Figure 1 shows an example of a model produced for the Victoria and Albert Museum.
  • Medical imaging.
    There have been many advances in medical imaging technologies for studying the internal features of the body, but there are situations where a cheap method for creating models of the external body could be put to good use. For instance, we can remove the need to take plaster casts in order to generate masks for radiotherapy or a brace for orthopaedics by obtaining a 3D model of part of the body from a small number of photographs. Figure 2 demonstrates a simple example: a model of a hand obtained using nothing more than a camera and a newspaper.
  • Entertainment, communication and the media.
    The entertainment industries are a huge source of demand for 3D data, both for film and television and for interactive experiences such as training simulations and games. This demand will only increase with the prevalence of new 3D display technologies that are already available in cinemas and in the home. As the ease of communication offered by the internet encourages global collaboration, there is a growing demand for communication technologies that reduce the need for travel and allow people to collaborate in an interactive setting. Systems to capture 3D content in real time will greatly increase the quality of this experience and allow the technology to blend into the background, making communication more natural. We might also like to create our own 3D content in the home: Figure 3 shows a model of a sculpture, which was obtained from 8 photos taken with a compact digital camera. The processing was performed automatically and required no technical knowledge on the part of the user.
  • Engineering simulation and analysis.
    The ability to capture accurate 3D models is very useful within the scientific and engineering communities. An example would be the structural analysis of buildings, providing verification of design and looking for wear and fatigue. The ability to perform accurate physical simulations during modern design processes creates a demand for models of existing infrastructure to improve the designs of the future, for example to study fluid flows around aircraft or buildings, or to assess and protect against earthquakes and hurricanes.

A common factor of these applications is the desire to reconstruct models from photos taken in the 'real world', away from the controlled conditions of the laboratory, and the recognition that the end users of these technologies are specialists in their own fields and not experts in computer vision. It is the desire to study vision whilst also making the outputs of this research useful to and usable by the people who may benefit from these technologies that motivates the contributions of my research.


[Marr 1982] D. Marr. Vision. W. H. Freeman & Co., 1982.