Center for Automation Research


For millions of years, humans have been perfecting their sensory abilities. Eyesight combined with brain power allows people
to identify faces, recognize objects, perceive their environment, and then take
action based on those observations. At the University of Maryland, a renowned
research and education center is transferring these same human sensory
skills to modern computing platforms. The Center for Automation Research, known as CfAR, is developing computer vision applications that are important in our
lives, including technologies for public safety, health care, education, entertainment, and more. We’ve been working on the problem of computer vision for over 50 years, trying to give computers the ability to see with the same accuracy that people do: to recognize everyday objects, everyday actions of people, and the context in which they’re performed.

While CfAR has almost two dozen faculty and research scientists in its
four unique labs, it can trace its roots back to the dedicated efforts of one man. The center was established by Azriel Rosenfeld, who was a research professor
here in College Park. And Azriel really was the founder of the whole field of computer vision: he wrote the first textbook, started the first journal and the first conference, and probably wrote most of the first few hundred papers.

Significant discoveries in computer vision have
already been made at Maryland, yet challenges remain. The Holy Grail here is to understand what’s in an image or a video: who is in it, what they’re doing, when the image was taken, where the image was taken. So we consider all of this as an inference problem: given a single image or a video sequence, how much information can you gather from it?

Much of the work in CfAR involves
obtaining information from images or video that have multiple subjects, moving objects, or imperfect focus. Now when you want to do video-based face recognition, you have to incorporate the motion of the face as well as the fact that the pose will vary, because people don’t just look at the camera when the video is running, and many surveillance cameras are very low resolution. So you have to deal with all of these problems together.
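
The transcript doesn’t spell out CfAR’s algorithms, but the idea of pooling evidence across video frames can be sketched. Below is a minimal illustrative sketch, assuming hypothetical per-frame face embeddings and quality scores (higher for frontal, sharp, well-resolved faces); the pooled track embedding is matched to a gallery by cosine similarity.

```python
import numpy as np

def pool_track_embedding(frame_embeddings, quality_scores):
    """Fuse per-frame face embeddings into one track-level embedding.

    frame_embeddings: (num_frames, dim) array, one embedding per video frame.
    quality_scores:   (num_frames,) array; hypothetically higher for frontal
                      pose, good focus, and adequate resolution.
    """
    w = np.asarray(quality_scores, dtype=float)
    w = w / w.sum()                          # normalize weights
    pooled = (w[:, None] * frame_embeddings).sum(axis=0)
    return pooled / np.linalg.norm(pooled)   # unit-normalize for cosine matching

def identify(track_embedding, gallery):
    """Return the gallery identity with the highest cosine similarity."""
    best_name, best_sim = None, -1.0
    for name, g in gallery.items():
        sim = float(track_embedding @ (g / np.linalg.norm(g)))
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name, best_sim

# Toy usage with random stand-in embeddings (a real system would use a
# face-embedding network and detector-derived quality scores).
rng = np.random.default_rng(0)
track = rng.normal(size=(30, 128))           # 30 frames, 128-D embeddings
quality = rng.uniform(0.1, 1.0, size=30)     # e.g., frontalness * sharpness
gallery = {"alice": rng.normal(size=128), "bob": rng.normal(size=128)}
print(identify(pool_track_embedding(track, quality), gallery))
```
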
CfAR works with data from multiple sources, building software tools that can quickly analyze and process electronic media and important documents, much of it in languages other than English. In
health care, the center is designing visualization tools that allow medical
professionals to gain new insight into serious conditions like traumatic brain
injury or Parkinson’s disease. Also underway are projects in virtual and
augmented reality. These immersive interactive platforms let people see and
use the information that matters most. To see where important news is happening, check out NewsStand, which can access more than 10,000 news feeds simply by
clicking on a map location. NewsStand provides a map query interface to access data, news and other data as well, and it exploits the notion of spatial synonyms. This means that you don’t have to know the exact place name you’re looking for in your search; a story about a neighborhood, for example, can surface in a query for the city or state that contains it. The application is available for mobile devices, allowing users to view news articles, photographs, and video clips in multiple languages. The audience for NewsStand is limitless. Anybody who is a news junkie is going to love being able to get access to things by location.
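
As a toy illustration of spatial synonyms (not NewsStand’s actual implementation), the sketch below uses a hypothetical place-containment hierarchy: an article tagged with a specific place also answers queries for any region that encloses it.

```python
# Toy illustration of spatial synonyms via a place-containment hierarchy.
# Place names and articles here are hypothetical; the real NewsStand geotags
# articles from thousands of feeds and indexes them spatially.
PARENT = {
    "College Park": "Maryland",
    "Baltimore": "Maryland",
    "Maryland": "United States",
}

def enclosing_places(place):
    """Yield the place and every region that contains it."""
    while place is not None:
        yield place
        place = PARENT.get(place)

articles = [
    {"headline": "New lab opens on campus", "place": "College Park"},
    {"headline": "Harbor festival draws crowds", "place": "Baltimore"},
]

def query(region):
    """Return articles whose place lies anywhere inside the queried region."""
    return [a for a in articles if region in enclosing_places(a["place"])]

print([a["headline"] for a in query("Maryland")])   # matches both articles
```
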
Other interactive tools developed at CfAR include Leafsnap, where you can take a mobile snapshot of
a leaf and immediately determine its species. There’s also Birdsnap, an electronic field guide that can identify hundreds of species of birds in the wild.
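
The apps’ internals aren’t described in the transcript; as one hedged sketch of the general recipe behind such electronic field guides, a photo is reduced to a feature vector and matched against a labeled gallery by nearest neighbor. The histogram “extractor” below is a stand-in for the real features.

```python
import numpy as np

def features(image):
    """Stand-in feature extractor: a normalized intensity histogram.

    A real field-guide app would use domain features instead, e.g.
    curvature statistics of the leaf contour or a learned embedding.
    """
    hist, _ = np.histogram(image, bins=32, range=(0, 256))
    return hist / max(hist.sum(), 1)

def identify_species(photo, gallery):
    """Nearest-neighbor match of the photo's features against labeled examples."""
    f = features(photo)
    return min(gallery, key=lambda ex: np.linalg.norm(f - ex["features"]))["species"]

# Toy gallery built from random stand-in images.
rng = np.random.default_rng(1)
gallery = [
    {"species": "red maple", "features": features(rng.integers(0, 256, (64, 64)))},
    {"species": "white oak", "features": features(rng.integers(0, 256, (64, 64)))},
]
print(identify_species(rng.integers(0, 256, (64, 64)), gallery))
```
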
We’re creating new technology that combines sound waves with light waves to build a 3-D acoustic camera. The device
has many potential applications—from improving security systems to enhancing
concert hall acoustics to protecting industrial workers’ hearing. The center
has a strong interest in autonomous robotics, combining artificial
intelligence, computer vision and language processing software to develop
robots able to think and act on their own. We are one of the very few groups in the world that work with robots and language. The best way to interact with robots is to speak to them, and they should answer you back. In order to implement this, the robot has to have an understanding of the meaning of the words, which is very challenging. In addition to language, the robots need to be able to see and
identify objects and clearly understand what it is they’re looking at. One of our main projects is understanding manipulation actions. So our work is on providing tools to track hands, but also to understand how the hands are involved in the action-object relationship, in order to predict motion very early on. When we have cognitive robots that can see, reason, and act in the world appropriately, it will fundamentally change our societies.
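
As a toy sketch of early prediction (not CfAR’s actual system), one can extrapolate the observed hand trajectory a few steps ahead and ask which object it is heading toward, anticipating the manipulation before the hand arrives.

```python
import numpy as np

def predict_target(hand_track, objects, horizon=10):
    """Guess which object a moving hand is reaching for, early in the motion.

    hand_track: (t, 2) array of observed 2-D hand positions.
    objects:    dict mapping object name -> 2-D position.
    horizon:    how many steps ahead to extrapolate.
    """
    velocity = hand_track[-1] - hand_track[-2]        # latest displacement
    predicted = hand_track[-1] + horizon * velocity   # linear extrapolation
    return min(objects, key=lambda o: np.linalg.norm(objects[o] - predicted))

# Toy scene: the hand has moved three steps toward the cup.
hand = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]])
scene = {"cup": np.array([8.0, 4.0]), "knife": np.array([2.0, 9.0])}
print(predict_target(hand, scene))                    # -> "cup"
```
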
Protecting public spaces, helping diagnose brain injuries, building new computing platforms to interact with media, with nature, and with other people in a digital world. These are just some of the projects underway in the Center for Automation Research at the University of Maryland. Where is our research headed next? Come and join us to find out.
