Intro: Discuss coming up with the idea, how we’ll tackle it
Getting Your Data: Need data to work with, here’s how to get it:
- Google Takeout allows you to easily export any of your Google data!
- Select none and then select Hangouts (exporting data for one app instead of all of them will speed up the creation time)
- You can have the download link emailed to you or have the archive added to your Drive (Drive is nice because the emailed download link eventually expires)
File Structure: My compressed file was 447 MB, most of which is attached images. This covers x years of chats. Hangouts.json itself was 55 MB.
Overall “conversations” level contains one “conversation” per chat:
ID: Ugy5UIsQC_vlBhTb7wJ4AaABAQ
type:
* STICKY_ONE_TO_ONE (single person)
* GROUP (multiple people)
participant_data:
* id
* * gaia_id 109048136476976024804
* * chat_id 109048136476976024804
* fallback_name: Kerry Peterson
* participant_type: GAIA
events:
conversation_id.id = Ugx1xQIDDAz70wGWc9B4AaABAQ
sender_id.gaia_id | .chat_id = 117837102170298606335
timestamp: 1372715320276317
chat_message
* message_content
* * segment
* * * type = TEXT | LINE_BREAK | LINK
* * * text = message | \n (optional) | link
* attachment
* * embed_item
* * * type: PLUS_PHOTO
* * * url
* * * media_type: PHOTO
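A rough sketch of turning one event's segments back into a single message string, based only on the field names listed above. The sample event is made up for illustration, and the exact nesting in a real Hangouts.json may differ, so treat the key paths as assumptions to check against your own export.

```python
# Hypothetical sample event shaped like the outline above (not real export data).
sample_event = {
    "conversation_id": {"id": "Ugx1xQIDDAz70wGWc9B4AaABAQ"},
    "sender_id": {"gaia_id": "117837102170298606335",
                  "chat_id": "117837102170298606335"},
    "timestamp": "1372715320276317",
    "chat_message": {
        "message_content": {
            "segment": [
                {"type": "TEXT", "text": "first line"},
                {"type": "LINE_BREAK"},  # text is optional for line breaks
                {"type": "LINK", "text": "https://example.com"},
            ]
        }
    },
}


def message_text(event):
    """Join the text of every segment in an event into one string."""
    # Assumed path: chat_message -> message_content -> segment (a list).
    content = event.get("chat_message", {}).get("message_content", {})
    parts = []
    for segment in content.get("segment", []):
        if segment.get("type") == "LINE_BREAK":
            # The "\n" text is marked optional, so always supply it ourselves.
            parts.append("\n")
        else:
            # TEXT and LINK segments both carry their content in "text".
            parts.append(segment.get("text", ""))
    return "".join(parts)


print(message_text(sample_event))
```

Events that only carry an attachment have no segments, so this returns an empty string for them rather than failing.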
Key Info: Users:
* Lets us pull only the conversation we want to generate stats for
* My ID: 109048136476976024804
* Jamie’s ID: 110383187481574444180
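One possible way to use the two IDs to pick out a single chat: look for the STICKY_ONE_TO_ONE conversation whose participants are exactly those two people. The helper names and the sample list are hypothetical, and this sketch assumes each conversation dict exposes `type` and `participant_data` directly, which may need adjusting to the real nesting in the export.

```python
MY_ID = "109048136476976024804"
JAMIE_ID = "110383187481574444180"


def participant_ids(conversation):
    """Return the set of gaia_ids taking part in a conversation."""
    return {p["id"]["gaia_id"] for p in conversation.get("participant_data", [])}


def find_one_to_one(conversations, id_a, id_b):
    """Return the first one-to-one conversation between id_a and id_b, or None."""
    wanted = {id_a, id_b}
    for conversation in conversations:
        if (conversation.get("type") == "STICKY_ONE_TO_ONE"
                and participant_ids(conversation) == wanted):
            return conversation
    return None


# Hypothetical, trimmed-down conversations for illustration only.
sample_conversations = [
    {"type": "GROUP",
     "participant_data": [{"id": {"gaia_id": MY_ID}},
                          {"id": {"gaia_id": JAMIE_ID}},
                          {"id": {"gaia_id": "117837102170298606335"}}]},
    {"type": "STICKY_ONE_TO_ONE",
     "participant_data": [{"id": {"gaia_id": MY_ID}},
                          {"id": {"gaia_id": JAMIE_ID}}]},
]

chat = find_one_to_one(sample_conversations, MY_ID, JAMIE_ID)
print(chat["type"] if chat else "no matching conversation")
```

Matching on the full set of participants, not just on one ID, keeps group chats that happen to include both people out of the stats.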
Possible Interesting Stats: Messages:
- By year 2013: ▇▇▇▇▇▇▇▇▇▇▇▇▇ 2,789 2014: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 4,518 2015: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 5,383 2016: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 5,380 2017: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 4,010 2018: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 10,010 2019: ▇▇▇ 655
- By month
- By time of day
- Common words (stop words removed)
- Related to events (look at calendar?): do we talk more during trips/school/work?
- Number of emojis
- Total Number
- Type counts (image, document)
- Total Number
- Most common domain
- Cross verticals (length of message by time of day)
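A minimal sketch of the "By year" breakdown using only the standard library, in case termgraph is not handy. It assumes the event timestamps are microseconds since the Unix epoch (the sample value 1372715320276317 is consistent with that) and buckets them in UTC; both are assumptions, and local time is probably the better choice for the time-of-day stats. The function names are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timezone

BAR = "▇"


def year_of(timestamp_us):
    """Year (UTC) of a timestamp assumed to be microseconds since the epoch."""
    seconds = int(timestamp_us) // 1_000_000
    return datetime.fromtimestamp(seconds, tz=timezone.utc).year


def count_by_year(timestamps_us):
    """Count how many timestamps fall in each year."""
    return Counter(year_of(ts) for ts in timestamps_us)


def render_bars(counts, width=50):
    """Render one 'key: bar count' line per key, scaled so the max is `width` blocks."""
    if not counts:
        return []
    largest = max(counts.values())
    lines = []
    for key in sorted(counts):
        blocks = counts[key] * width // largest  # round down to whole blocks
        lines.append(f"{key}: {BAR * blocks} {counts[key]:,}")
    return lines


# The yearly totals from the list above, rendered again as a check.
totals = {2013: 2789, 2014: 4518, 2015: 5383, 2016: 5380,
          2017: 4010, 2018: 10010, 2019: 655}
print("\n".join(render_bars(totals)))
```

The same `Counter` pattern should carry over to the by-month and by-hour breakdowns by swapping `.year` for `.month` or `.hour`.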
Graphing:
* https://github.com/mkaz/termgraph
* https://seaborn.pydata.org/examples/wide_data_lineplot.html
Notes:
sender | year | month | day | hour | timestamp | message
--------------------------------------------------------
kerry | 2019 | 08 | 15 | 22 | 232432424 | message content here!
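A sketch of flattening one message into the row layout from the notes (sender, year, month, day, hour, timestamp, message). It assumes timestamps are microseconds since the epoch and formats them in UTC; the `NAMES` lookup and both function names are hypothetical helpers, with the two IDs taken from the Key Info section.

```python
from datetime import datetime, timezone

# Hypothetical lookup from gaia_id to a short display name.
NAMES = {
    "109048136476976024804": "kerry",
    "110383187481574444180": "jamie",
}


def to_row(sender_id, timestamp_us, message, names=NAMES):
    """Build a (sender, year, month, day, hour, timestamp, message) tuple."""
    ts = int(timestamp_us)
    # Assumption: microseconds since the epoch, interpreted as UTC.
    moment = datetime.fromtimestamp(ts // 1_000_000, tz=timezone.utc)
    return (
        names.get(sender_id, sender_id),  # fall back to the raw ID
        moment.strftime("%Y"),
        moment.strftime("%m"),  # zero-padded, like "08" in the notes
        moment.strftime("%d"),
        moment.strftime("%H"),
        ts,
        message,
    )


def format_row(row):
    """Render a row with the same ' | ' separator used in the notes."""
    return " | ".join(str(field) for field in row)


row = to_row("109048136476976024804", "1372715320276317", "message content here!")
print(format_row(row))
```

Keeping year, month, day, and hour as separate columns means each stat becomes a simple group-by on one column.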