The Process of Virtual Life


In my last two blog entries I showed you the results of creating virtual denizens based on actual census data. I feel compelled to give a few more details on the simulation process, as the generation of virtual denizens is only a small cog in the bigger picture of that simulation.

Let us start at the beginning. As the census data is based on people's home addresses, we want to generate random locations in an area of interest where the homes of the virtual denizens will be located. In our case we are generating locations around the greater Montréal area. The first step is easy enough: we generate uniformly distributed latitudes and longitudes in the area of interest. However, people are not evenly distributed, so we need a way to achieve a proper population density distribution. To achieve that, we first query the Google API to obtain the postal code associated with each random location (this also has the side effect of removing most of the locations that fall inside a body of water… we will not complain!). We still need a second transformation of the location information to obtain the census tract (the smallest area for which census information is available, corresponding to approximately 5,000 people). That transformation can be achieved via a web page query provided by the Canadian census bureau. We can now look up the census information and keep random locations according to the actual population density, dropping the extra locations. From that process we generated about 11,000 home locations for virtual denizens. As we need to drop a good proportion of the generated random locations, and as our query rate is throttled by the Google API, this process alone can take many days.
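To make the "generate uniformly, then keep according to density" idea concrete, here is a minimal Python sketch. It is not the actual notebook code: the bounding box, the `lookup_tract` callback (standing in for the postal code and census tract queries) and the acceptance rule (keep a point with a probability proportional to the tract density) are all assumptions made for illustration.

```python
import random

# Hypothetical bounding box, only roughly covering the greater Montréal area.
LAT_RANGE = (45.35, 45.85)
LON_RANGE = (-74.05, -73.25)


def random_location(rng):
    """Draw one uniformly distributed (latitude, longitude) in the box."""
    return (rng.uniform(*LAT_RANGE), rng.uniform(*LON_RANGE))


def keep_probability(tract_population, tract_area_km2, max_density):
    """Acceptance probability proportional to the tract population density."""
    if tract_area_km2 <= 0 or max_density <= 0:
        return 0.0
    return min(1.0, (tract_population / tract_area_km2) / max_density)


def sample_homes(n_candidates, lookup_tract, max_density, seed=0):
    """Rejection sampling: keep candidate locations according to density.

    lookup_tract is a hypothetical callback returning (population, area_km2)
    for a location, or None when no tract is found (for example in water).
    """
    rng = random.Random(seed)
    homes = []
    for _ in range(n_candidates):
        location = random_location(rng)
        tract = lookup_tract(location)
        if tract is None:
            continue  # no postal code / census tract: drop the location
        population, area_km2 = tract
        if rng.random() < keep_probability(population, area_km2, max_density):
            homes.append(location)
    return homes
```

With such a scheme most candidates in sparsely populated tracts are rejected, which is consistent with having to drop a good proportion of the generated locations.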

The next step is the one whose results you have already seen. From the census data, we generate a random denizen for each of the retained home locations. The Canadian census data is complemented with data from the Québec statistical bureau for the age pyramid information. The overall picture is decorated with information obtained from a data bank providing the most popular first names and a similar one providing the most popular last names. Again, the result looks like the following:

Maxim Lavoie a Female of 27 year old, born in Quebec.

  • Residence is located around coordinates: [45.4872,-73.4226], postal code: J3Y 4Z1.
  • Phone number: +15146589535
  • Attended No certificate, diploma or degree.
  • Is currently working full-time on a rotating or other shift in Health occupations for the Health care and social assistance industry and goes to an usual place for work.
  • Usually work/attend school on: Mon Tue Wed Thu Fri from 05h30 to 12h45 (including commute).
  • Has an income of $60,000 to $79,999.
  • Has a regular activity/hobby performed around coordinates: [45.4925,-73.4375] on: Mon Wed Thu from 15h00 to 16h00 (including commute).
  • Usually move around by means of Car, truck or van – as a driver for a commute distance of 10.3 km to a location around coordinates: [45.4760,-73.3308].
  • Usually sleep from 20h00 to 04h00.
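As a rough illustration of how a denizen like the one above could be drawn from census tables, the sketch below samples each attribute from a per-tract distribution. The attribute names and the proportions are made-up placeholders, not census figures, and sampling each attribute independently is a simplifying assumption of this sketch.

```python
import random

# Placeholder distributions for one census tract (values are NOT real data).
TRACT_DISTRIBUTIONS = {
    "sex": {"Female": 0.51, "Male": 0.49},
    "age_group": {"0-14": 0.16, "15-64": 0.68, "65+": 0.16},
    "commute_mode": {
        "Car, truck or van - as a driver": 0.60,
        "Public transit": 0.25,
        "Walking": 0.15,
    },
}


def draw(distribution, rng):
    """Pick one category with probability proportional to its weight."""
    categories = list(distribution)
    weights = [distribution[category] for category in categories]
    return rng.choices(categories, weights=weights, k=1)[0]


def generate_denizen(distributions, rng):
    """Draw every attribute independently from its tract distribution."""
    return {name: draw(dist, rng) for name, dist in distributions.items()}


if __name__ == "__main__":
    print(generate_denizen(TRACT_DISTRIBUTIONS, random.Random(1)))
```

Independent draws reproduce each marginal distribution but ignore correlations between attributes (age and employment, for instance), so a real generator would likely need conditional tables for some of them.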

Now that we have the basic statistical information defining a denizen, we need to simulate their daily life. The Google API now becomes a major bottleneck, as the throttling of direction queries between two points would make our simulation take too much time. To remove that bottleneck, we created from Google Maps two low-fidelity transit maps: one for cars (mainly highways and major roads) and one for public transit such as subway and trains. These maps, combined with a shortest-path algorithm, enable us to simulate the virtual denizens’ locations with sufficient fidelity. For example, our virtual denizens will follow a straight path to the closest entry point on the low-fidelity highway map and will then proceed along the network to the exit point, the point closest to their destination. From there they will complete their trip in a straight line to their final destination. Walking and bicycling follow straight lines as an approximation, as they are usually shorter trips. Finally, speed on and off the highway/public transit networks depends on the mode of transportation and the time of day, to simulate rush hours.

Fig. 1: Low Fidelity Subway and Train (left) and Highway (right) network.
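The routing described above (straight line to the closest network node, shortest path on the network, straight line to the destination) can be sketched as follows. This is only an illustration under assumptions: the network is a small dictionary of named nodes, distances are great-circle distances, and Dijkstra's algorithm is used as the shortest-path method, which the original simulation may or may not use.

```python
import heapq
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_node(point, nodes):
    """Name of the network node closest to the point."""
    return min(nodes, key=lambda name: haversine_km(point, nodes[name]))


def shortest_path(edges, start, goal):
    """Dijkstra on edges = {node: {neighbour: km}}; returns (km, path)."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, km in edges.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (dist + km, neighbour, path + [neighbour]))
    return math.inf, []


def route(origin, destination, nodes, edges):
    """Straight line to the entry node, network path, straight line out."""
    entry = nearest_node(origin, nodes)
    exit_node = nearest_node(destination, nodes)
    network_km, path = shortest_path(edges, entry, exit_node)
    total_km = (haversine_km(origin, nodes[entry]) + network_km
                + haversine_km(nodes[exit_node], destination))
    return total_km, path
```

Travel time would then follow by dividing each leg by a speed that depends on the mode of transportation and the time of day; that part is left out of the sketch.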

 

Now we have the virtual denizens’ daily lives simulated. We know who they are and where they are at every moment, whether at home, at work, at their usual hobby place, at the grocery store/restaurant/… or in transit between those places. This is where I currently am. My next steps are to use the Transmission Sites data I found on the web site of Innovation, Science and Economic Development Canada to determine which cell site they are currently using based on their location, to generate mobile usage behavior based on their current activities, and then to simulate the QoS experienced by the virtual denizens, along with other mobile network usage information, in order to get a proper simulation of mobile usage in a metropolitan area.
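For the cell site step, the simplest possible assumption is that a denizen is served by the geographically closest site. The sketch below shows that assumption only; the site identifiers and coordinates are hypothetical, and a real network would also depend on antenna direction, power, frequency and load.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def serving_site(location, sites):
    """Return the id of the site closest to the location.

    sites maps a site id to its (lat, lon); nearest-site selection is an
    assumption of this sketch, not a statement about real cell selection.
    """
    return min(sites, key=lambda site_id: haversine_km(location, sites[site_id]))


if __name__ == "__main__":
    # Hypothetical sites, not taken from the Transmission Sites data.
    sites = {"site-1": (45.50, -73.57), "site-2": (45.60, -73.75)}
    print(serving_site((45.4872, -73.4226), sites))
```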

For us the journey will not stop there. In parallel we are creating information dashboards and applying machine learning algorithms in order to digest that simulated data and show compelling insights about our virtual denizens. Eventually we hope to demonstrate it with real mobile network data, but let’s take one step at a time!


Virtual Peoples (part two)

I’m back from the holiday vacations and continuing with the generation of virtual people, in order to simulate them at a later stage. In this first step I want to generate virtual people that are accurate with respect to census data (or other data sources). This means that by analysing a large enough group of these virtual people, we should get back data equivalent to what we get from the 2011 Canadian census (as one of the data sources).
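One way to check that property is to aggregate a generated population and compare the observed shares of an attribute with the target shares. The following is a small sketch of such a check; the record layout (one dictionary per virtual person) and the function names are assumptions made for the example.

```python
from collections import Counter


def empirical_shares(people, attribute):
    """Share of each value of an attribute in the generated population."""
    counts = Counter(person[attribute] for person in people)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def max_abs_gap(observed, expected):
    """Largest absolute difference between two share dictionaries."""
    categories = set(observed) | set(expected)
    return max(abs(observed.get(c, 0.0) - expected.get(c, 0.0))
               for c in categories)
```

A small gap on every attribute of interest, for a large enough sample, would suggest the generator reproduces the source tables; how small is small enough is a judgment call left open here.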

As mentioned above, the first step is to generate virtual people; this includes (for now) the data you can see below for a small generated sample. In the next step, I will simulate those people by making them walk/cycle/drive/use public transit around the city to go from home to work or to their regular hobby/activity, but also to shop around.

Finally, we will track these virtual people through their simulated mobile phone usage. We will give them rules to follow based on the demographic information we generated, and hopefully recover those rules through the analysis of the tracking data we will generate. The rules will dictate their mobile usage behaviour through their different activities: work, home, hobby, shopping, transit, sleep…
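To give an idea of what such a rule could look like, here is a hedged sketch that finds the activity of a virtual person at a given time of day and looks up a usage rate for it. The rule table and its values are invented placeholders, the `HHhMM` time format mirrors the generated samples, and days of the week are ignored for brevity.

```python
# Hypothetical rule table: activity -> average data sessions per hour.
USAGE_RULES = {"sleep": 0.0, "work": 2.0, "home": 4.0, "transit": 6.0}


def to_minutes(hhmm):
    """Convert a time such as '05h30' to minutes since midnight."""
    hours, minutes = hhmm.split("h")
    return int(hours) * 60 + int(minutes)


def in_interval(now, start, end):
    """True if now is in [start, end), including intervals past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def activity_at(schedule, hhmm, default="home"):
    """Return the scheduled activity at the given time, else the default."""
    now = to_minutes(hhmm)
    for activity, (start, end) in schedule.items():
        if in_interval(now, to_minutes(start), to_minutes(end)):
            return activity
    return default


if __name__ == "__main__":
    schedule = {"sleep": ("20h00", "04h00"), "work": ("05h30", "12h45")}
    activity = activity_at(schedule, "23h15")
    print(activity, USAGE_RULES[activity])
```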

The following sample of virtual people gives you an idea of the level of fidelity we could achieve with the later simulation, and shows some new information I have added to the generation in the last week. I can also show a little map of the points of interest, as demonstrated for the first few people, where home is in green, work in blue and the hobby place in orange.

Do not try to reach those virtual people at their virtual addresses or virtual phone numbers, unless you have a virtual phone to reach them with. As last time, any resemblance between the generated virtual people in this article and any persons, living or dead, is a prodigious phenomenon of such rarity that it would be of supernatural occurrence!

Lydia Cote a Female of 65 year old, born in Africa.
Residence is located around coordinates: [45.5889,-73.5776], postal code: H1S 1M7.
Phone number: +15146123554
Attended No certificate, diploma or degree.
Not in labour force
Has an income of Under $5,000.
Has a regular activity/hobby performed around coordinates: [45.5697,-73.6289] on: Thu Fri from 14h00 to 16h00.
Usually move around by means of Walking.

[Map image: virtual1]

Noah Girard a Male of 65 year old, born in Quebec.
Residence is located around coordinates: [45.7591,-73.7345], postal code: J7M 1H5.
Phone number: +15149673695
Attended No certificate, diploma or degree.
Not in labour force
Has an income of $50,000 to $59,999.
Usually move around by means of Car, truck or van – as a driver.

[Map image: virtual2]

William Gagnon a Male of 45 year old, born in Quebec.
Residence is located around coordinates: [45.7774,-73.4166], postal code: J5Y 3S3.
Phone number: +15144522173
Attended Apprenticeship or trades certificate or diploma.
Not in labour force
Has an income of $80,000 to $99,999.
Usually move around by means of Car, truck or van – as a driver.

[Map image: virtual3]

Hannah Pelletier a Female of 85 year old, born in Quebec.
Residence is located around coordinates: [45.4505,-73.6014], postal code: H4E 2V7.
Phone number: +15147797183
Attended Apprenticeship or trades certificate or diploma.
Is not currently employed but usually work in: Health occupations for the Arts, entertainment and recreation industry.
Has an income of $10,000 to $14,999.
Usually move around by means of Car, truck or van – as a driver for a commute distance of 9.4 km to a location around coordinates: [45.4978,-73.6538].

[Map image: virtual4]

Erika Morin a Female of 36 year old, born in Quebec.
Residence is located around coordinates: [45.6403,-73.8524], postal code: J7E 3W1.
Phone number: +15149922233
Attended High school diploma or equivalent.
Not in labour force
Has an income of $20,000 to $29,999.
Has a regular activity/hobby performed around coordinates: [45.6379,-73.8470] on: Tue Thu from 06h00 to 08h00.
Usually move around by means of Walking.

Florence Cote a Female of 57 year old, born in Quebec.
Residence is located around coordinates: [45.6000,-73.9162], postal code: J7R 7L4.
Phone number: +15142937123
Attended No certificate, diploma or degree.
Is currently working part-time on a rotating or other shift in Sales and service occupations for the Public administration industry and goes to an usual place for work.
Usually work on: Mon Tue Wed Thu Fri from 07h45 to 13h45.
Has an income of $30,000 to $39,999.
Usually move around by means of Car, truck or van – as a driver for a commute distance of 30.5 km to a location around coordinates: [45.5691,-73.6425].

Olivier Caron a Male of 9 year old, born in Quebec.
Residence is located around coordinates: [45.5932,-73.6993], postal code: H7G 4X7.
Kid at school
Usually move around by means of Car, truck or van – as a passenger for a commute distance of 3.6 km to a location around coordinates: [45.5706,-73.6971].

Eleonore Belanger a Female of 4 year old, born in Quebec.
Residence is located around coordinates: [45.6020,-73.7941], postal code: H7L 3M6.
Kid in daycare
Has a regular activity/hobby performed around coordinates: [45.5815,-73.8078] on: Thu Sat from 12h00 to 14h00.
Usually move around by means of Walking for a commute distance of 33.5 km to a location around coordinates: [45.6306,-73.4921].

Henri Gauthier a Male of 13 year old, born in Quebec.
Residence is located around coordinates: [45.5222,-73.3274], postal code: J3V 2P9.
Phone number: +15149920044
Kid at school
Usually move around by means of Car, truck or van – as a passenger for a commute distance of 12.8 km to a location around coordinates: [45.5466,-73.4384].

Lucas Cote a Male recent immigrant of 62 year old, born in Americas.
Residence is located around coordinates: [45.4948,-73.4176], postal code: J3Y 3J9.
Phone number: +15147784103
Attended No certificate, diploma or degree.
Is currently working full-time on a rotating or other shift in Management occupations for the Arts, entertainment and recreation industry and goes to an usual place for work.
Usually work on: Mon Tue Wed Thu Fri from 18h30 to 06h30.
Has an income of $20,000 to $29,999.
Has a regular activity/hobby performed around coordinates: [45.4987,-73.4206] on: Tue Thu Sat from 20h00 to 22h00.
Usually move around by means of Car, truck or van – as a driver for a commute distance of 6.1 km to a location around coordinates: [45.4612,-73.4459].

Virtual Peoples…

This is my last day of work before the Christmas and New Year festivities. Time for one last update on the crazy stuff I am currently doing, and also time to wish you and your families all the best for the new year!

My group is currently working on a data science project and, as people in that field might have experienced, it is sometimes hard to come by data… As others have probably done in the past, we decided in a first phase to generate our own data from simulation.

We have to simulate people, more precisely in this case, people from the Montreal area. We want the simulation to be representative of reality, so we start from Canadian census data and generate virtual denizens that are statistically correct for their home location. I’m also generating names from various sources (including a tombstone registry). I’m still in the first stages of that simulation, but here is a subset of the virtual denizens I have generated for my neighborhood region. (Any resemblance between the generated virtual denizens in this article and any persons, living or dead, is a miracle.)

Rita Desjardins a Female Adult of 22 year old, born in Europe.
– Who attended College, CEGEP or other non-university certificate or diploma.
– Is not in the work force.
– Has an income of $5,000 to $9,999.
– Usually move around by means of Car, truck or van – as a passenger.

Youssef Bedard a Male Kid of 3 year old, born in Quebec.
– Is currently attending daycare.
– Usually move around by means of Car, truck or van – as a passenger.

Louis Martel who is not a Canadian citizen.

Lyana Belanger a Female Kid of 16 year old, born in Quebec.
– Is currently attending school.
– Usually move around by means of Car, truck or van – as a passenger.

Sam Roy a Male Adult of 22 year old, born in Quebec.
– Who attended Bachelor’s degree.
– Is currently working full-time in: Business, finance and administration occupations
– For the Educational services industry and Worked at usual place.
– Has an income of $60,000 to $79,999.
– Usually move around by means of Car, truck or van – as a driver.

Matheo Gagnon a Male Adult of 22 year old, born in Asia.
– Who attended College, CEGEP or other non-university certificate or diploma.
– Is currently working full-time in: Trades, transport and equipment operators and related occupations
– For the Manufacturing industry and Worked at usual place.
– Has an income of $60,000 to $79,999.
– Usually move around by means of Car, truck or van – as a driver.

Marc Bouchard a Male Adult recent immigrant of 45 year old, born in Europe.
– Who attended Bachelor’s degree.
– Is not in the work force.
– Has an income of $50,000 to $59,999.
– Usually move around by means of Public transit.

You can expect me to publish some portion of the Python Notebook performing this generation sometime in the new year, if this is of interest to you.

In the meantime, I wish you joyful end-of-year festivities!