Gdago
- Project Name:
- Author: Gonzalo Abella (abellagonzalo [at] gmail [dot] com) & Jose María Cañas (jmplaza [at] gsyc [dot] es)
- Academic Year: 2008-2009
- Degree:
- Jde Version: jde-4.2.1
- SVN Repository: http://svn.jde.gsyc.es/users/gdago/project
- Tags:
- Technology: c, c++, jde suite, openCV
- State: Developing
- Source License:
- Document License:
- Abstract:
This project develops a visual attention system for a mobile robot that builds a representation of its 3D environment. The attentive control is based on a saliency map focused on the relevant objects in the image. We analyse the real images from a stereo pair of cameras, so we can localize relevant objects in the 3D world. The attention system analyzes the image in a way similar to the human eye, looking at the things that attract attention: colors, edges, motion or prior knowledge of the objects in the scene. Some theories argue that the human visual system has a limited processing capacity and that attention acts as a neuronal filter that selects which information should be processed at each moment.
- Other projects: http://svn.jde.gsyc.es/users/gdago/video4linux2
- Documentation:
Master thesis: PDF
Presentation: PDF
- Evolution:
20090601
Here is the video of a very interesting experiment in which we combine 2D attention with 3D attention. To introduce points in the system, we use a color filter. At first, with three points, we use the 'intelligence' of the system to calculate the fourth point, and we introduce the square in the 3D memory. After that, we visit the corners of the square to check whether they are still there. We also introduce a new point in the image, and the schema is able to recognize it and insert it in the 3D memory as well.
20090521
Here are some videos showing what our schema is able to do.
In this first video, we introduce three points in the system through 2D attention. After that we start using our 3D attention system and visit the three points in turn. You can see that if a point is lost and we do not find it again after a period of time, it is removed from the 3D attention system.
In the second video, we introduce segments instead of points. This way, reconstructing a scene is much faster. We use a Canny filter (edges) and get the segments with a Hough function. You can see the results in the video below.
To finish this collection of videos, we have the cognitive system working with our 3D attention system. We treat the square as a figure, not just as four unconnected points.
20090518
The 3D visual attention algorithm is finished. In the video below you can see how it works. At first, it introduces three points using 2D attention (color filter). With those three points, we use a cognitive algorithm to estimate where the fourth point has to be (it is outside the image). At this point the 3D attention algorithm starts to work. We visit all the points in the memory: if they still have saliency, we keep them in the memory; if not, we erase them. At the same time, if a point has saliency, we readjust its position.
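The visiting loop described above can be sketched in a few lines. This is only an illustration, not the schema's code: `AttentionPoint`, `update_memory`, the `observe` callback and the saliency threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class AttentionPoint:
    position: tuple   # (x, y, z) in world coordinates
    saliency: float


def update_memory(memory, observe, threshold=0.5):
    """Visit every stored point; keep and readjust salient ones, drop the rest.

    `observe` is a hypothetical callback: it looks at a stored position and
    returns (measured_position, measured_saliency).
    """
    kept = []
    for point in memory:
        position, saliency = observe(point.position)
        if saliency >= threshold:
            # Still salient: keep it, with its position readjusted.
            kept.append(AttentionPoint(position, saliency))
        # Otherwise the point is erased by simply not copying it.
    return kept
```

In a real system `observe` would move the cameras to the stored position and measure the saliency in the images; here it is left to the caller.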
20090504
We have implemented another optimization: a cognitive system. It detects squares from just three points and deduces where the fourth point has to be. It only imagines where the point should be, without putting it in the scene, because it does not exist. Here is a demonstration video. After calculating three points, it calculates the fourth one and the two cameras look at it.
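One simple way to deduce the missing corner, given here only as a sketch and not necessarily the method used in the schema: find which of the three known points is the corner shared by two sides (the one whose two edges are closest to perpendicular) and complete the parallelogram. The function name `fourth_corner` is made up for this example.

```python
def fourth_corner(p0, p1, p2):
    """Estimate the missing corner of a square from three known corners."""
    pts = [p0, p1, p2]

    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def edge_dot(i):
        # Dot product of the two edges leaving point i (0 for a right angle).
        return abs(dot(sub(pts[(i + 1) % 3], pts[i]),
                       sub(pts[(i + 2) % 3], pts[i])))

    # The shared corner is the point where the two edges are most perpendicular.
    shared = min(range(3), key=edge_dot)
    a, b = pts[(shared + 1) % 3], pts[(shared + 2) % 3]
    # Parallelogram completion: the fourth corner is a + b - shared corner.
    return tuple(x + y - c for x, y, c in zip(a, b, pts[shared]))
```

For example, `fourth_corner((0, 0, 0), (1, 0, 0), (0, 1, 0))` returns `(1, 1, 0)`. With noisy 3D points the result is only as good as the three measured corners.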
20090429
We have optimized the pixel matching method. To do so, we have divided the search into two smaller searches. In the first one, we look for an area using the vergence method, that is, we select an area in three-dimensional space. After this we run a second, more specific search, in which we use the traditional triangulation method of 3D reconstruction. This technique improves the results quite a lot.
As in the previous experiment, the red points are the points calculated by the algorithm and the blue points are the points calculated manually.
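For the triangulation stage, a common formulation is the midpoint method: each matched pixel defines a ray from its camera, and the 3D point is taken halfway between the closest points of the two rays. The sketch below assumes the rays are already known (origin and direction in world coordinates); `triangulate_midpoint` is a hypothetical name and the schema may use a different variant.

```python
def triangulate_midpoint(o1, d1, o2, d2):
    """Return the midpoint of the shortest segment joining two 3D rays.

    o1, o2 are the ray origins (optical centres); d1, d2 their directions.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    w = tuple(p - q for p, q in zip(o1, o2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel, no unique closest point")
    # Parameters of the closest points along each ray.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = tuple(o + s * k for o, k in zip(o1, d1))
    p2 = tuple(o + t * k for o, k in zip(o2, d2))
    return tuple((x + y) / 2 for x, y in zip(p1, p2))
```

Because real rays rarely intersect exactly, the length of the segment between the two closest points can also serve as a quality measure for the match.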
20090415
For the final experiments I'm measuring the error when clicking on some points. The red points are the results returned by the automatic comparator whereas the blue points are the result of clicking the same pixels in both images.
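A straightforward way to quantify this comparison is the Euclidean distance between each automatic (red) point and its manual (blue) counterpart. The sketch below is an assumption about how the error could be summarised; the source does not state which metric is used, and `reconstruction_error` is a hypothetical name.

```python
import math


def reconstruction_error(automatic, manual):
    """Return (mean, max) Euclidean distance between paired 3D points."""
    if len(automatic) != len(manual) or not automatic:
        raise ValueError("need two non-empty lists of the same length")
    distances = [math.dist(a, m) for a, m in zip(automatic, manual)]
    return sum(distances) / len(distances), max(distances)
```

The two lists must be in the same order, so that each automatic point is compared with the manual point obtained from the same pixel.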
20090324
Now the 3D reconstruction can be done semi-automatically. When you click on a pixel in the left camera image, the schema automatically searches for the equivalent pixel in the other image using a 3D reconstruction algorithm.
20090320
A manual 3D reconstruction is now possible. You have to click some points in both images (the order is very important: it has to be the same in both). But the most interesting part is that even if you move the cameras, the 3D reconstruction stays the same. You can see the test in the video.
20090313
The movement model is finished. Thanks to it, we can move the cameras freely without losing the calibration. We have implemented two different ways to move the cameras. The first one uses four buttons: up, down, left, right. There is not much to explain xD. In the second one, you enter a 3D point and press the 'go' button, and both cameras look at that point. Here is a video example.
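Pointing a pan-tilt camera at a 3D point comes down to two angles. The sketch below assumes a frame with x forward, y to the left and z up, and a camera whose pan and tilt axes cross at its position; these conventions and the name `pan_tilt_to` are assumptions for the example, not taken from the schema.

```python
import math


def pan_tilt_to(target, camera=(0.0, 0.0, 0.0)):
    """Return (pan, tilt) in degrees so that `camera` looks at `target`."""
    dx, dy, dz = (t - c for t, c in zip(target, camera))
    pan = math.atan2(dy, dx)                    # rotation around the z axis
    tilt = math.atan2(dz, math.hypot(dx, dy))   # elevation above the xy plane
    return math.degrees(pan), math.degrees(tilt)
```

For a stereo pair, the function would be called once per camera with each camera's own position, which makes both cameras converge on the same point.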
20090302
The extrinsic calibrator of the schema is implemented. To calibrate the cameras, first of all you have to calibrate the intrinsic parameters with JDE's calibrator. We are only going to use the intrinsic parameters from it. Here are two snapshots of the process, from the left camera and from the right camera, respectively.
Once you have these parameters, you have to fill in the camera parameter files. The names of these files are configurable in the schema configuration file, and the files have to follow a fixed structure. After running the schema, you load these camera parameter files with the 'load parameters' button. In the same way, you can save any change by clicking on the 'save parameters' button.
In order to calibrate the extrinsic parameters, you have to draw the world in OpenGL. To make sure that the parameters are correct, you can click on the image and a ray will be drawn in the OpenGL world. If the virtual ray and the real ray pass through the same point, the calibration is considered valid. Here is an example video:
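The ray behind a clicked pixel follows from the pinhole camera model. The sketch below is a generic illustration and does not use the Progeo API: the intrinsic parameters (`fx`, `fy`, `cx`, `cy`), the camera-to-world `rotation` matrix and the optical `centre` are hypothetical inputs, and lens distortion is ignored.

```python
import math


def pixel_to_ray(u, v, fx, fy, cx, cy, rotation, centre):
    """Back-project pixel (u, v) into a world ray (origin, unit direction).

    rotation is a 3x3 camera-to-world matrix given as a list of rows;
    centre is the optical centre of the camera in world coordinates.
    """
    # Direction in the camera frame, with the z axis along the optical axis.
    d_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)
    # Rotate the direction into the world frame.
    d_world = tuple(sum(rotation[i][j] * d_cam[j] for j in range(3))
                    for i in range(3))
    norm = math.sqrt(sum(c * c for c in d_world))
    return centre, tuple(c / norm for c in d_world)
```

If the extrinsic parameters are right, the ray drawn from this origin and direction should pass through the real object that was clicked in the image.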
20090223
Here is a video of what the new graphical interface will look like.
The window is divided into three parts. The top left side is reserved for the images provided by the camera. Above the main image there is a radio button to select which camera image you want to see: "image A" is the left camera and "image B" is the right camera. Below the main image there is an image selector. You can select the displayed image by clicking on the small images. You can choose between three: the main image, the instantaneous saliency image and the accumulated saliency image.
The top right side of the window displays the 3D world with OpenGL.
At the bottom of the window there will be several tabs with buttons to configure the schema. In the video you can see a slider to configure the ignored side, used to avoid the radial distortion of the image.
20090216
In the end, we have not been able to finish the project this month. We have tried very hard, but it has been impossible. After a couple of meetings we have decided to make some changes:
- First of all, we are going to rebuild the graphical interface to make the schema easier to control.
- Second, we are going to revise the algorithms to optimize them.
- And third, we are going to improve the functions of the schema. For example, we are studying how to rectify the images provided by the camera to avoid the radial distortion.
20090130
Today I have finished the vergence problem. Below you can see an example. First I configure the schema. After selecting the "vergencia" radio button, I click on the left image. The schema automatically calculates 9 points of the ray, and you can see their projection in the right image. In the second tab you see the same thing in an OpenGL world: the vergence points and the optical centers of the cameras. While I switch tabs you can see that the real camera is moving and focusing on the points I had calculated. Finally, in the "correlation profile" you see the comparison of the patches. The first seven are very similar because they are all on the box.
If you cannot see it very well, you can download it from here.
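A correlation profile of this kind can be built by comparing the patch around the clicked pixel with the patch around each projected point. The source does not say which similarity measure is used, so the sketch below assumes normalized cross-correlation; `ncc` and `correlation_profile` are hypothetical names, and patches are plain lists of rows of grey levels.

```python
import math


def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equally sized patches, in [-1, 1]."""
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    # A flat patch has no texture to correlate with; report 0 in that case.
    return num / den if den else 0.0


def correlation_profile(reference, candidates):
    """Score each candidate patch against the reference patch."""
    return [ncc(reference, candidate) for candidate in candidates]
```

The candidate with the highest score is the most likely match, and several similar high scores in a row, as with the seven points on the box, indicate that the patches look alike along that stretch of the ray.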
20090126
Here is an example of the power of the schema. The level of detail of the generated 3D map can be set. In this case it is at the minimum level, and only the central area has been scanned. You can clearly see the word "isight" from the box.
20081216
After some time without updating the page, because I had nothing to write about although I was working hard, I can now show you what I was working on. It is a C++ class that calculates the 3D intersection point for a stereo pair of cameras. It is easy to use: after initialization (of course), just select a pixel in one of the cameras and it returns the 3D point. It uses the Progeo library for the 3D representation and OpenCV for the image processing. Here is a short video calculating the point in 2D. The quality of the video is horrible and it is impossible to appreciate anything.
20081208
This is how the Pioneer looks with the stereo pair of cameras. It does not look bad, but it does not work yet... xD
20081204
The new GUI is almost done. You can see all the images and select which one you want to see. The GUI is too heavy and it won't work unless it is run on a powerful PC.
20081202
Now we can calibrate the cameras at 640x480. I have ported Redo's calibrator to use this size. The next step is to port it to Gtk.
20081119
At last we have something cool!! We have a panoramic image where you can see the whole world of the camera. It is a first approximation and the images don't fit together very well. You can see the result in the image below.
20081118
I have added the movement space. It is a rectangle that represents the movement space of the camera. It also shows the camera position in real time.
20081107
I have changed the GUI. The small images under the main image are, from left to right, the normal image, the image rebuilt from log-polar and the saliency filter image rebuilt from log-polar. We have a new feature: the main image shows the small image you click on. The image at the right edge is the filtered log-polar image.
20081106
I have implemented the log-polar image using the libraries written by Jose María Cañas a few years ago. Here you can see an image.
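For reference, the log-polar representation replaces each pixel's Cartesian coordinates with the logarithm of its distance to a centre and its angle around that centre, so the area near the centre gets more resolution than the periphery. The sketch below shows only the coordinate mapping and its inverse; it is not the library mentioned above, and the function names are made up for the example.

```python
import math


def to_logpolar(x, y, cx, cy):
    """Map pixel (x, y) to (rho, theta) around the centre (cx, cy)."""
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r == 0:
        raise ValueError("the centre itself has no log-polar coordinates")
    return math.log(r), math.atan2(dy, dx)


def from_logpolar(rho, theta, cx, cy):
    """Inverse mapping: recover the Cartesian pixel position."""
    r = math.exp(rho)
    return cx + r * math.cos(theta), cy + r * math.sin(theta)
```

Rebuilding a normal image from a log-polar one applies the inverse mapping to every log-polar cell, which is why the rebuilt image looks sharp in the centre and coarse at the edges.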
20081027
Attention project created. In this first version you can see the filters: the edge, motion and color filters. The last one is, for the moment, configured to filter just the red color. The camera automatically centers the image on the most interesting point, but only along the x axis. While the camera is moving, the image freezes.
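Centering along the x axis can be reduced to measuring how far the most salient pixel is from the middle column. The sketch below is an assumption about how such an error could be computed, not the schema's code; `pan_error` is a hypothetical name and the saliency map is a plain list of rows.

```python
def pan_error(saliency):
    """Horizontal offset of the most salient pixel, normalized to [-1, 1].

    Negative values mean the point is left of the image centre,
    positive values mean it is to the right.
    """
    height, width = len(saliency), len(saliency[0])
    if width < 2:
        raise ValueError("the saliency map needs at least two columns")
    _, best_col = max(((r, c) for r in range(height) for c in range(width)),
                      key=lambda rc: saliency[rc[0]][rc[1]])
    centre = (width - 1) / 2
    return (best_col - centre) / centre
```

A pan command proportional to this error (with a gain chosen by hand) would move the camera until the error is close to zero.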
20081031
Followball project almost finished. I have implemented a followball based on Roberto Calvo's followball. It has been developed for the Sony EVI-D100P camera. This camera is a lot faster than the Directed Perception pan-tilt unit. To track the target we use an HSV filter because it is more tolerant of lighting changes than the RGB filter. An autofocus algorithm is implemented, but it still has to be optimized. Here is an example without using the zoom.
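The reason HSV tolerates lighting changes better is that brightness lives in a single channel (V), so the hue test stays the same when the scene gets darker. The sketch below uses Python's standard `colorsys` module; the target hue (red) and all thresholds are illustrative values, not the ones used in followball.

```python
import colorsys


def is_target_colour(r, g, b, hue=0.0, hue_tol=0.05, min_sat=0.5, min_val=0.2):
    """Return True if an 8-bit RGB pixel is close to the target hue.

    hue and hue_tol are fractions of the colour circle (0.0 is red).
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # Hue is circular, so 0.98 is as close to red as 0.02.
    diff = abs(h - hue)
    diff = min(diff, 1.0 - diff)
    return diff <= hue_tol and s >= min_sat and v >= min_val
```

With these thresholds both a bright red pixel `(255, 0, 0)` and a much darker one `(90, 0, 0)` pass the filter, while an equivalent RGB threshold on the red channel would have to be retuned for each lighting condition.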
- More information:











