Politecnico di Milano
Harvard University
Data-Shack program 2019
Harvard-Politecnico di Milano Joint Program on Data Science

Intro

We are proposing a two-part research collaboration and innovative "hands-on" education experience involving students and faculty from the Institute for Applied Computational Science (IACS) at Harvard’s John A. Paulson School of Engineering and Applied Sciences (SEAS) and from the Master Courses in Computer Engineering and Design of Communication at Politecnico di Milano.

Eight students, four from Harvard and four from Politecnico di Milano, will join together to solve problems in data science. Under the supervision of Harvard and Politecnico di Milano faculty, this activity will give students the opportunity to work collaboratively on real-world applications.

Problems will cross the disciplines of data management, machine learning, data analysis, statistics and mathematics, data visualization and user experience design. Students will craft their solutions by developing the methodology, software, visualization, and high-performance elements, then testing and completing the solution, and finally producing papers that may be submitted for publication.

This joint venture will provide an exciting research opportunity for graduate students in data science to apply their work to the problems of society.

Participating institutions

Harvard University

At Harvard University, the Institute for Applied Computational Science (IACS) is the home for students and faculty who are tackling major challenges in science and the world through the use of computational methods. IACS trains graduate students to solve real-world problems and conduct innovative research by using mathematical models, algorithms, systems innovations and statistical tools. Embedded within a large liberal arts research university, IACS serves as the focal point for interdisciplinary collaborations in computational science at Harvard and in the Boston-area community.

The one-year master of science program, developed by IACS, provides students with a rigorous core curriculum equally balanced between computer science, applied math and statistics. This training, combined with the flexibility to explore elective topics, equips students to solve problems in whatever arena they choose to work in. Project-based courses complement the curriculum and provide students with practical experience in collaborative problem solving.


Politecnico di Milano

The master schools in computer engineering and communication design of Politecnico di Milano share the mission of creating qualified professionals, capable of understanding, monitoring and mastering the pressing needs of a continuously evolving society. Computer engineering and design offer master's programs that produce several hundred top-quality graduates yearly.

The two schools are increasingly engaged in promoting interdisciplinary educational experiences with mixed teams of students; the interaction between engineers and designers produces very powerful forms of innovation, thanks to the mix of information technology and sound computational and mathematical foundations on one side, and creativity, design thinking and effective interaction design on the other. Both schools are engaged in new programs and courses centered on data science, which includes data-driven management, analysis, and visualization, with a problem-solving approach to training.

Context

We are flooded with data: many petabytes (PB) of public data are currently accessible electronically, and this volume is growing. The availability of big data is bringing a paradigm shift in how we understand public opinion and in how we plan and grow cities, where advances are becoming more and more data-driven. Government officials, city planners, statisticians, computer scientists, and engineers have begun collaborating to tackle these data science problems.

Scientists, in general, are increasingly recognizing the value of analyzing vast amounts of data to answer many interesting questions, such as:

  • What activities should be planned?
  • Can neighborhoods be classified and characterized automatically?
  • Can events, such as crime or an increase in traffic, be predicted ahead of time so that additional resources can be allocated?
  • Can we use other data sources to identify substandard rental conditions?
  • Can we use contextual data to analyze the heartbeat of a city?
  • Can we create databases fast enough to keep up with the stream of data?

All of these challenges require deep knowledge in statistics, machine learning and computer science. To achieve this, it is necessary to a) work in parallel on multiple facets of the problem and b) combine scientists who are specialized in different areas.

While we have been working on multiple projects addressing the questions above, we also believe that this new scientific paradigm requires a different type of scientist, the data scientist, who is multidisciplinary and has multiple skills ready to tackle these types of problems.

IACS has a strong program in data science, including: multiple courses in data science, a master’s program in computational science and engineering, a summer research experience program for undergraduates, hands-on workshops and a symposium on data science topics each January, a seminar series, and more. Politecnico di Milano has recently started an interdisciplinary Master’s program on big data management, and hosts several courses in data science, with an emphasis on social analytics and interaction design (see http://urbanscope.polimi.it).

Our international program will involve top-quality students in engineering and design, and will be an exciting, small-scale experiment that will be very beneficial for the participating students. The program will explore new directions for education in data science, by capitalizing on the strengths of the two institutions.

The development of these new methods benefits other fields beyond the scope of this project. Fields such as biomedicine, molecular genetics, and business intelligence face similar challenges in dealing with the rapidly increasing amount of available data. All of these fields have to build new tools that enable a deeper analysis of the information and that achieve suitable solutions for their scientific problems.

Data-Shack 2019 projects


ALPITOUR GROUP

Alpitour was founded in 1947 in Italy as a small travel agency serving the desire to travel of a modernizing country, which was creating what is today acknowledged as the “made in Italy miracle”. Preserving in its DNA the Italian legacy of warm and stylish hospitality, it is today an international group with a turnover of 1.4 billion euros, with activities ranging from tour operating to hotel management, aviation and travel agencies. Despite the paradigm shift that the digital revolution brought to tourism, which affected major tour operators, Alpitour weathered the crisis by reinforcing its assets and growing in knowledge and specialization.

With a brand portfolio able to serve highly diversified customers, with offerings ranging from travel packages to “made to measure” experiences, the Alpitour Group is ready to explore the opportunities of the big-data society by getting closer to travelers and possibly forecasting their future destinations.

The students' challenge is to explore the possibility of detecting the bottom-up trends that drive customers' decision-making when they choose travel destinations.

The applied approach will focus on analyzing past travel package purchases by Alpitour customers over two years of seasonal cycles. These data will be correlated with digital content about travel, produced both by institutional publishers (webzines and blogs) and by the Alpitour social media community. The data analysis will be used to detect the most influential sources, both in digital media and in social media communities (influencers), that correlate with the travel purchasing behavior of Alpitour customers. The data analytics will be used to design a predictive algorithm focused on identifying and ranking future travel destinations. Special attention will be given to data visualization and to the design of the interface tools.

The final goal will be to design a prototype of a predictive tool, to be tested by Alpitour, that provides insights into future travel destinations relevant to Alpitour customers. A possible further development could be the design of a recommender system which, based on the predictive model, could suggest ideal destination packages to Alpitour social media communities/clusters.
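As a rough illustration of the correlation step described above, the following Python sketch scores each destination by how strongly media mentions in one week track purchases a few weeks later. It is only a sketch under stated assumptions: the weekly aggregation, the fixed lag, the function names, and the destination labels "A" and "B" are all hypothetical, the numbers are made up, and a lagged Pearson correlation is just one simple stand-in for whatever predictive method the student teams eventually choose.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no signal
    return cov / sqrt(var_x * var_y)


def lagged_correlation(mentions, purchases, lag=1):
    """Correlate mentions in week t with purchases in week t + lag."""
    if lag > 0:
        mentions, purchases = mentions[:-lag], purchases[lag:]
    return pearson(mentions, purchases)


def rank_destinations(mentions_by_dest, purchases_by_dest, lag=1):
    """Order destinations by how strongly mentions lead purchases."""
    scores = {
        dest: lagged_correlation(mentions_by_dest[dest], purchases_by_dest[dest], lag)
        for dest in mentions_by_dest
    }
    return sorted(scores, key=scores.get, reverse=True)


# Made-up weekly counts for two hypothetical destinations, for illustration only.
mentions = {"A": [1, 2, 3, 4, 5], "B": [1, 2, 3, 4, 5]}
purchases = {"A": [0, 1, 2, 3, 4], "B": [9, 4, 3, 2, 1]}
ranking = rank_destinations(mentions, purchases, lag=1)
```

A high score only indicates that mentions and later purchases moved together in the past; it does not by itself establish influence, which is why the project also calls for identifying the sources behind the content.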

MSL Italy - Publicis Groupe

Founded in 1926, Publicis Groupe is now the third-largest communications group in the world. Through a powerful alchemy of creativity and technology, it is driving business transformation across the entire value chain.

The Milano agency, MSL Italy, is one of the earliest examples of cross-agency integration between data, technology and creativity, and has become the reference within Publicis Communications for data science and consumer insights. It works with international clients, such as Ferrero, PMI and Netflix, for which it designs and shapes global marketing and communications strategies. Starting from September 2018, within the Power of One logic, MSL Italy will actively take part in the Spine project, whose goal is to combine the data the group has access to (programmatic, social, analytics, audience), enhancing its capacity to talk to consumers in the most relevant way.

The project aims to address the question of how one can connect all the dots between online interactions, anticipate them, and understand virality, thus enabling the creation of a tailor-made marketing and communications strategy. The applied approach will focus on analyzing historical and current data from social media, interactions and information from the web, through semantic logic and image recognition, to understand and model the impact of a given piece of content on the Internet and how it would propagate over time and across channels. The students' challenge is to give a unique ID to a post, a user, or a link, and then track it across the different publicly available data sources (including social media), with the aim of anticipating which content is most likely to generate buzz and spread, and how. Moreover, the students could explore the possibility of interconnecting different types of data, such as the cost per post for an influencer, the engagement, the conversions and similar trackable metrics, to predict what a potential piece of online content would result in.

The end goal would be to model a tool that gives an accurate representation of the possible impact of a specific piece of online content (according to its typology and the person or type of person who is posting it) and of how it could spread across different channels and periods. A critical aspect will be how creatively and innovatively the data are visualized and how easily they can be interpreted.
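To make the idea of a unique ID for a link more concrete, here is a minimal Python sketch that normalizes a URL and hashes the result, so that the same link shared on different channels maps to the same identifier. Everything in it is an assumption made for illustration: the normalization rules, the list of tracking parameters, the 16-character ID length, and the function names are hypothetical, and the project itself does not prescribe any of them.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of query parameters that only carry tracking information.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid", "gclid"}


def canonical_url(url):
    """Normalize a URL so that trivially different copies compare equal."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Drop tracking parameters and sort the rest for a stable order.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_KEYS and not key.startswith(TRACKING_PREFIXES)
    ]
    query.sort()
    path = parts.path.rstrip("/") or "/"
    # The fragment is discarded because it is not sent to the server.
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(query), ""))


def content_id(url):
    """Short, stable identifier derived from the canonical URL."""
    digest = hashlib.sha256(canonical_url(url).encode("utf-8")).hexdigest()
    return digest[:16]


# Two copies of the same hypothetical link, as shared on different channels.
shared_on_social = "https://WWW.Example.com/post/1/?utm_source=tw&b=2&a=1#frag"
shared_by_email = "https://example.com/post/1?a=1&b=2"
same_content = content_id(shared_on_social) == content_id(shared_by_email)
```

Identifying posts and users across platforms is harder than identifying links, since there is usually no shared key; this sketch covers only the simplest of the three cases.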

Learning objectives

Our collaborative work and this proposal focus on two key areas:

  1. the development of methods for analyzing social media data
  2. the development of software design, web design and business development

Students will learn how to:

  1. Deal with data: choice, extraction, integration, visualization.
  2. Choose the most appropriate data analysis method.
  3. Build software and computational artifacts that are robust, reliable, and maintainable.
  4. Communicate between different disciplines.
  5. Work and collaborate in international teams.

The application at Harvard will be open to all students in the Graduate School of Arts and Sciences who plan to take the AC 297r Capstone Course; the application at Polimi will be open to 2nd year Master’s students of Computer Science and Communication Design.

Tentative timeline

  1. November 1st, 2018: Applications open simultaneously for students at Harvard and Polimi; students from both countries will apply to their home institution, and then all applications will be reviewed collaboratively by faculty at Harvard and Polimi.
  2. November 20th, 2018: Application deadline.
  3. December 1st, 2018: Students are notified of the decision.
  4. January 23-30, 2019: Polimi faculty and students travel to Cambridge to kick off the project at Harvard University.
  5. End of March 2019: Harvard faculty and students travel to Milan.
  6. February - May 2019: Student teams interact to develop their projects within the framework of the Harvard capstone course, jointly managed with Politecnico faculty.
  7. May 2019: Final presentations of results take place simultaneously in Milan and Cambridge.

Application, benefits and obligations

Politecnico di Milano

This program will select four students currently attending the second year of the Master (Laurea Magistrale) of Politecnico di Milano, two from Information Engineering and two from Communication Design.

Applicants should have obtained at least 40 CFU during their first year of Master studies, with an average grade above 25/30.

Applications must be sent by email by November 20th to Laura Caldirola (laura.caldirola@polimi.it).

The application must include as attachments:

  • an up-to-date record of exams, the English proficiency certificate, a CV (including a description of experiences of programming and of group projects)
  • a motivation letter explaining the candidate's interest in this educational opportunity
  • the preference between the two problems.

Students may be interviewed by the selection committee.

Selected students must attend the two full-immersion periods, in January at Harvard and in March at Politecnico.

They will independently organize their travel to Harvard and their local accommodation; Politecnico will reimburse their expenses based on receipts. In addition to the full-immersion periods, they are expected to work throughout the semester, following the format of Harvard's Capstone Course and interacting with professors of Politecnico and Harvard University.

They are expected to deliver a public presentation of their results at the end of the program. Upon acceptance, they will sign a letter in which they confirm that they understand and agree to the obligations of the program; failure to comply will result in losing their right to benefits.

Harvard University

Harvard students must be in a Harvard master’s or PhD program.

The Data-Shack program is being offered as a part of the AC 297r Capstone course, so students signing up for this program must commit to taking this course during the spring semester.

Students are required to be on campus and participate in program activities with peers from Politecnico from January 23 to January 30. (Note that this is prior to the official start of the spring semester.)
Students must be available to travel to Milan during the week of spring break, in March. Students must commit to collaborating with peers at Politecnico throughout the entire spring semester and will be expected to deliver a public presentation of their results at the end of the program.

The four students chosen to participate will be required to pay a $150 deposit to secure their spot in this program. IACS will cover airfare, double occupancy hotel and some meals. Students are required to pay for some meals and additional expenses.

A complete application includes submission of the form linked here and emailing your resume as a PDF to Sheila Coveney, IACS Program Manager, at coveney@seas.harvard.edu, before December 1st.

Contacts

POLIMI Faculty


Harvard Faculty


  • David Sondak

    Professor of Physics, Harvard John A. Paulson School of Engineering and Applied Sciences.
    dsondak@seas.harvard.edu
  • Pavlos Protopapas

    Scientific Program Director and Lecturer, Institute for Applied Computational Science, Harvard John A. Paulson School of Engineering and Applied Sciences.
    pavlos@seas.harvard.edu
  • Hanspeter Pfister

    Professor of Computer Science, Harvard John A. Paulson School of Engineering and Applied Sciences.
    pfister@seas.harvard.edu