Urban Informatics Course

This Wiki page contains information for students participating in the joint course module arranged by the School of Architecture (artists), the Department of Civil Engineering (semi-artists, semi-nerds), and the Department of Signal Processing (nerds). The information mainly concerns the technical part of the course.



The goal of this course is threefold. First, signal processing students will be familiarised with recent problems in data-driven architecture and urban design, and with design and analysis principles in general. Second, architecture and civil engineering students will learn about modern signal processing methods, in particular audio and video processing, to broaden their knowledge of what kinds of tools are available, or can be made available, for quantitative city behaviour analysis - Urban Informatics. The third and equally important goal is for students to innovate in their projects: only the tools are provided, and the students should themselves find the best and most innovative ways to use those tools!

During the course, architecture, civil engineering and signal processing students will form groups that together define a problem, collect data, and develop methods to automatically provide "urban information". The students are guided by tutors from the three departments of Tampere University of Technology: Architecture, Civil Engineering, and Signal Processing.


Related work

Your best friend is - as always - Google, and you should search for related work and methods that can solve your problem. However, methods published in certain highly regarded conferences and journals should be preferred over others. Below we list the best sources for video (image) and audio processing (our university has free access to all of them).

Video and image processing and analysis

The best starting point is to search for recent papers (not older than 5 years) in the top conferences in the field of computer vision:
  • IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • International Conference on Computer Vision (ICCV)
  • European Conference on Computer Vision (ECCV)
Works presented in the above conferences represent the state of the art, and the articles are always publicly available; the authors often post demo videos to YouTube, share their presentation slides, and, even better, share their code!

Audio processing and analysis

Good starting points for audio (including speech) related conferences and journals are:
  • International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • Annual Conference of the International Speech Communication Association (Interspeech)
  • European signal processing conferences (EUSIPCO)
  • IEEE Transactions on Audio, Speech, and Language Processing
  • The Journal of the Acoustical Society of America
  • Speech Communication (Elsevier)
  • Computer Speech and Language (Elsevier)
  • The Journal of the Audio Engineering Society (AES)


Online Kampusareena video and audio data


  • Kampusareena:
    • Audio captured at a 48 kHz sampling rate with 32-bit floating-point accuracy using 8 Sennheiser MKE-2-PC electret condenser microphones connected to a sound card (PreSonus FP10).
  • Hand-held devices available for loan: Olympus LS-14
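As a minimal sketch of working with this kind of data, assuming the recordings are delivered as (or converted to) float arrays of shape (samples, channels) - e.g. loaded with a WAV reader of your choice - the per-channel signal level can be checked like this (the synthetic sine frames below stand in for a real recording):

```python
import numpy as np

def channel_rms(frames: np.ndarray) -> np.ndarray:
    """Root-mean-square level of each channel.

    frames: float array of shape (n_samples, n_channels), e.g. one
    48 kHz / 32-bit float recording from the 8-microphone array.
    """
    return np.sqrt(np.mean(frames.astype(np.float64) ** 2, axis=0))

# Synthetic stand-in for a real recording: 1 s of 8-channel audio,
# each channel a 440 Hz sine with a different amplitude.
fs, n_ch = 48000, 8
t = np.arange(fs) / fs
frames = np.stack([0.1 * (c + 1) * np.sin(2 * np.pi * 440 * t)
                   for c in range(n_ch)], axis=1).astype(np.float32)
rms = channel_rms(frames)  # one value per microphone channel
```

A quick per-channel RMS check like this is an easy way to spot dead or clipped microphones before spending time on analysis.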


We have an IP camera installed in Kampusareena. Contact the supervisors to gain access to the camera stream.

Existing datasets


For lists of available datasets refer to http://www.cs.tut.fi/~heittolt/datasets

For the Kampusareena audio, you need to annotate your audio files yourself. For this purpose, e.g., Audacity (an audio editor) has an annotation (label track) capability. E-mail pasi<dot>pertila<at>tut.fi to obtain the dataset, which covers September 8 - November 4, daily between 8:00 and 17:50, with 5 seconds recorded at 10-minute intervals.
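Audacity exports label tracks as plain tab-separated text (start time, end time, label per line). A small parser like the sketch below gets such annotations into your own code; the example string stands in for a real exported file:

```python
import io

def parse_audacity_labels(fileobj):
    """Parse an Audacity label-track export.

    Each non-empty line is: start_seconds <TAB> end_seconds <TAB> label.
    Returns a list of (start, end, label) tuples.
    """
    labels = []
    for line in fileobj:
        line = line.strip()
        if not line:
            continue
        start, end, text = line.split("\t", 2)
        labels.append((float(start), float(end), text))
    return labels

# Stand-in for open("labels.txt") on a real export.
example = "0.000000\t5.000000\tspeech\n600.000000\t605.000000\tfootsteps\n"
annotations = parse_audacity_labels(io.StringIO(example))
```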


Video data from public IP-cameras: dataset. The dataset contains video material from a lively street (night and day) and from a plaza.

Video data from Kampusareena: dataset.

You can also manually download video data from the camera:
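If you want to script the downloading instead of doing it by hand, the sketch below assumes the camera exposes an HTTP JPEG-snapshot endpoint; the URL, endpoint, and filename scheme here are hypothetical - ask the supervisors for the real stream address and credentials:

```python
from datetime import datetime
from urllib.request import urlopen

def snapshot_filename(ts: datetime) -> str:
    """Timestamped filename, e.g. 'kampusareena_20160203_1520.jpg'."""
    return f"kampusareena_{ts:%Y%m%d_%H%M}.jpg"

def save_snapshot(snapshot_url: str, ts: datetime) -> str:
    """Fetch one JPEG frame from the camera's (hypothetical) HTTP
    snapshot endpoint and write it to a timestamped file on disk."""
    name = snapshot_filename(ts)
    with urlopen(snapshot_url) as resp, open(name, "wb") as out:
        out.write(resp.read())
    return name

# Usage (with the real camera URL from the supervisors):
# save_snapshot("http://<camera-address>/snapshot.jpg", datetime.now())
```

Timestamped filenames make it easy to later align the frames with the audio annotations.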


General tools



The easiest way to get started with video analysis is to use ready-made libraries and toolboxes, such as OpenCV and Matlab, and build your code on top of them. OpenCV has very good documentation, provides interfaces for C++, C, Python and Java, and supports Windows, Linux, Mac OS, iOS and Android. If you are using Matlab, the Computer Vision System Toolbox can be helpful.
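OpenCV ships ready-made background subtractors (e.g. cv2.createBackgroundSubtractorMOG2) for finding moving objects in a video stream. As a conceptual sketch of what such tools do, here is the simplest possible version - frame differencing - in plain NumPy, run on two tiny synthetic frames (the threshold value is an arbitrary illustration, not a recommendation):

```python
import numpy as np

def motion_mask(prev_gray: np.ndarray, curr_gray: np.ndarray,
                threshold: int = 25) -> np.ndarray:
    """Boolean mask of pixels whose grayscale value changed by more
    than `threshold` between two frames - the crude version of the
    adaptive background subtraction OpenCV provides."""
    diff = np.abs(curr_gray.astype(np.int16) - prev_gray.astype(np.int16))
    return diff > threshold

# Two synthetic 4x4 grayscale "frames": a bright 2x2 blob appears.
prev_frame = np.zeros((4, 4), dtype=np.uint8)
curr_frame = prev_frame.copy()
curr_frame[1:3, 1:3] = 200
mask = motion_mask(prev_frame, curr_frame)
moving_pixels = int(mask.sum())
```

For real camera footage you would use OpenCV's adaptive subtractors instead, since they cope with gradual lighting changes that plain differencing does not.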

Specific tools

Face API (Microsoft)

Detect one or more human faces in an image and get back face rectangles indicating where in the image the faces are, along with face attributes containing machine-learning-based predictions of facial features. After detecting faces, you can pass the face rectangle to the Emotion API to speed up processing. The available face attributes are age, gender, pose, smile, and facial hair, along with 27 landmarks for each face in the image.

Detect human faces and compare similar ones, organize people into groups according to visual similarity, and identify previously tagged people in images.
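The Face API is called over HTTPS. The sketch below builds (but does not send) a detection request following the v1.0 REST interface; the endpoint host and subscription key are placeholders, and the exact paths and attribute names may differ in newer API versions, so check Microsoft's current documentation before use:

```python
import json
from urllib.request import Request

def build_face_detect_request(endpoint: str, subscription_key: str,
                              image_url: str) -> Request:
    """Build a Face API face-detection request.

    Sends the image by URL and asks for a few face attributes.
    The caller is responsible for actually sending it (e.g. with
    urllib.request.urlopen) and parsing the JSON response.
    """
    params = "returnFaceAttributes=age,gender,headPose,smile,facialHair"
    url = f"{endpoint}/face/v1.0/detect?{params}"
    body = json.dumps({"url": image_url}).encode("utf-8")
    return Request(url, data=body, headers={
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": subscription_key,
    }, method="POST")

# Placeholder endpoint/key/image - substitute your own.
req = build_face_detect_request(
    "https://example.cognitiveservices.azure.com",
    "YOUR_SUBSCRIPTION_KEY",
    "https://example.com/photo.jpg")
```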

Crowd counting and tracking (Mobility - Large scale)

You may use the existing datasets that contain images (video) and ground truth (e.g., the true people count for each frame):
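Since these datasets provide a true people count per frame, a natural way to evaluate a counting method is the mean absolute and mean squared error over frames. A minimal sketch (the counts below are made-up illustration values):

```python
def count_errors(true_counts, predicted_counts):
    """Mean absolute error (MAE) and mean squared error (MSE) of
    per-frame people-count predictions against ground truth."""
    errors = [abs(t - p) for t, p in zip(true_counts, predicted_counts)]
    mae = sum(errors) / len(errors)
    mse = sum(e ** 2 for e in errors) / len(errors)
    return mae, mse

# Ground-truth counts for five frames vs. a hypothetical detector.
mae, mse = count_errors([10, 12, 9, 11, 14], [9, 12, 11, 10, 15])
```

Reporting both metrics is common in crowd-counting papers: MAE is easy to interpret, while MSE penalises occasional large miscounts more heavily.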

Self-organising audio-visual map (Chain of events in urban space)

Course projects

Project report

The final submission of your project is a 6-8 page article written using the provided template (doc and LaTeX templates). The idea of using the template is that you become familiar with scientific writing, which is valuable for your later career as it forces you to describe your work in a compact and concise form. You can find an example project on the University of California San Diego Statistical Visual Computing Group's web site on crowd monitoring (the page also contains a PDF of their CVPR article, which uses the provided template). The final reports will be shared with the other groups so that you can read about each other's work. Submit your final report by email to your instructor. Moreover, for signal processing students, part of the submission is the code, which must be committed to the project Subversion repository.

Report. The following is a suggested structure for the report:
  • Title, Author(s) (members of the group)
  • Abstract: no more than 300 words
  • Introduction: this section introduces your problem and the overall plan for approaching it - put effort into this section to make your work interesting to the reader!
  • Background/Related Work: This section discusses literature relevant to your project, both from the architecture and the signal processing side
  • Approach: This section details the framework of your project. Be specific: include equations, figures, plots, references to the work the ideas came from, etc.
  • Experiments: This section begins with what kind of experiments you are running, what data you are using, and how you measure or evaluate your results. It then shows the results of your experiments in detail. By detail, we mean both quantitative evaluations (numbers, figures, tables, etc.) and qualitative results (images, example outputs, etc.).
  • Conclusion: What have you learned, and how does the conducted work support the goal set in the beginning? Suggest future ideas.
  • References: These are absolutely necessary for adopted methods and sustainable design.
-- JoniKaemaeraeinen - 03 Feb 2016

Topic revision: r20 - 15 Dec 2017 - 08:51:59 - PasiPertilae

