Google Hangout Stats

Intro: Discuss coming up with the idea, how we’ll tackle it

Getting Your Data: Need data to work with, here’s how to get it:

  • https://takeout.google.com/
    • Allows you to easily export any of your Google data!
      • Select none and then select Hangouts (exporting data for one app instead of all of them will speed up the creation time)
      • You can send the link as email or upload to your Drive (Drive is nice because the download link eventually expires)

File Structure: My compressed file was 447 MB; most of that is attached images. This covers x years of chats. Hangouts.json was 55 MB.

Overall “conversations” level contains one “conversation” per chat:

ID: Ugy5UIsQC_vlBhTb7wJ4AaABAQ

type:
  • STICKY_ONE_TO_ONE (single person)
  • GROUP (multiple people)

participant_data:
  • id
    • gaia_id 109048136476976024804
    • chat_id 109048136476976024804
  • fallback_name: Kerry Peterson
  • participant_type: GAIA

events:
  • conversation_id.id = Ugx1xQIDDAz70wGWc9B4AaABAQ
  • sender_id.gaia_id | .chat_id = 117837102170298606335
  • timestamp: 1372715320276317
  • chat_message
    • message_content
      • segment
        • type = TEXT | LINE_BREAK | LINK
        • text = message | \n (optional) | link
    • attachment
      • embed_item
        • type: PLUS_PHOTO
        • url
        • media_type: PHOTO
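A rough sketch of reading one event, assuming the fields sit exactly where these notes list them (a real export may nest them differently) and assuming the 16-digit timestamp is microseconds since the Unix epoch. The `event` dict and both helper names are made up for illustration.

```python
from datetime import datetime, timezone

# Hypothetical event shaped like the field list in these notes.
event = {
    "conversation_id": {"id": "Ugx1xQIDDAz70wGWc9B4AaABAQ"},
    "sender_id": {
        "gaia_id": "117837102170298606335",
        "chat_id": "117837102170298606335",
    },
    "timestamp": "1372715320276317",
    "chat_message": {
        "message_content": {
            "segment": [
                {"type": "TEXT", "text": "hello"},
                {"type": "LINE_BREAK", "text": "\n"},
                {"type": "LINK", "text": "https://takeout.google.com/"},
            ]
        }
    },
}


def event_time(event):
    """Return the event time as an aware UTC datetime.

    Assumption: the timestamp is in microseconds; sub-second
    precision is dropped.
    """
    micros = int(event["timestamp"])
    return datetime.fromtimestamp(micros // 1_000_000, tz=timezone.utc)


def event_text(event):
    """Join the text of every segment; events without text give ''."""
    content = event.get("chat_message", {}).get("message_content", {})
    return "".join(seg.get("text", "") for seg in content.get("segment", []))


print(event_time(event).isoformat(), repr(event_text(event)))
```

Using `.get` with defaults means attachment-only events do not raise a `KeyError`.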

Key Info:

Users:

  • Lets us pull only the conversation we want to generate stats for
  • My ID: 109048136476976024804
  • Jamie’s ID: 110383187481574444180
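One possible way to pick out a single chat from the user IDs, sketched under the assumption that each conversation dict carries `type` and `participant_data` directly, as in the structure notes. The sample `conversations` list and the function names are hypothetical.

```python
MY_ID = "109048136476976024804"
JAMIE_ID = "110383187481574444180"

# Hypothetical, trimmed-down conversations for illustration only.
conversations = [
    {
        "type": "GROUP",
        "participant_data": [
            {"id": {"gaia_id": MY_ID, "chat_id": MY_ID}},
            {"id": {"gaia_id": JAMIE_ID, "chat_id": JAMIE_ID}},
            {"id": {"gaia_id": "117837102170298606335",
                    "chat_id": "117837102170298606335"}},
        ],
    },
    {
        "type": "STICKY_ONE_TO_ONE",
        "participant_data": [
            {"id": {"gaia_id": MY_ID, "chat_id": MY_ID}},
            {"id": {"gaia_id": JAMIE_ID, "chat_id": JAMIE_ID}},
        ],
    },
]


def participant_ids(conversation):
    """Collect the gaia_id of every participant in a conversation."""
    return {p["id"]["gaia_id"] for p in conversation.get("participant_data", [])}


def find_one_to_one(conversations, my_id, their_id):
    """Return the first one-to-one chat between exactly these two users."""
    wanted = {my_id, their_id}
    for conv in conversations:
        if conv.get("type") == "STICKY_ONE_TO_ONE" and participant_ids(conv) == wanted:
            return conv
    return None


chat = find_one_to_one(conversations, MY_ID, JAMIE_ID)
```

Comparing ID sets rather than checking membership keeps group chats that happen to include both people out of the result.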

Possible Interesting Stats:

Messages:

  • By year 2013: ▇▇▇▇▇▇▇▇▇▇▇▇▇ 2,789 2014: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 4,518 2015: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 5,383 2016: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 5,380 2017: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 4,010 2018: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 10,010 2019: ▇▇▇ 655
  • By month
  • By time of day
  • Length
  • Common words (stop words removed)
  • Related to events (look at calendar?), talk a lot during trips/school/work?
  • Number of emoji
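A minimal sketch of the by-year count and a terminal bar chart, using only the standard library rather than termgraph. It assumes timestamps are microseconds since the epoch and buckets by UTC year; the function names and the 50-character bar width are illustrative choices.

```python
from collections import Counter
from datetime import datetime, timezone


def count_by_year(timestamps_us):
    """Count messages per calendar year (UTC) from microsecond timestamps."""
    return Counter(
        datetime.fromtimestamp(int(ts) // 1_000_000, tz=timezone.utc).year
        for ts in timestamps_us
    )


def render_bars(counts, width=50):
    """Return one 'label: bar count' line per key, scaled to the largest count."""
    peak = max(counts.values())
    lines = []
    for label in sorted(counts):
        # Integer arithmetic avoids float rounding; bars are rounded down.
        bar = "▇" * (counts[label] * width // peak)
        lines.append(f"{label}: {bar} {counts[label]:,}")
    return lines


for line in render_bars({2018: 10010, 2019: 655}):
    print(line)
```

The same `render_bars` helper would work for the by-month or by-hour counts, since it only needs a mapping of label to count.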

Attachments:

  • Total Number
  • Type counts (image, document)
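A sketch of the attachment counts, assuming `attachment` is a list under `chat_message` and that each entry holds an `embed_item` with a `media_type`, as the structure notes suggest. The real export may nest these differently, and the sample `events` are invented.

```python
from collections import Counter


def attachment_type_counts(events):
    """Tally attachments by media_type across all events."""
    counts = Counter()
    for event in events:
        attachments = event.get("chat_message", {}).get("attachment", [])
        for attachment in attachments:
            item = attachment.get("embed_item", {})
            counts[item.get("media_type", "UNKNOWN")] += 1
    return counts


# Hypothetical events: one with a photo, one with text only.
events = [
    {"chat_message": {"attachment": [
        {"embed_item": {"type": "PLUS_PHOTO",
                        "url": "https://example.com/a.jpg",
                        "media_type": "PHOTO"}},
    ]}},
    {"chat_message": {"message_content": {
        "segment": [{"type": "TEXT", "text": "no attachment"}],
    }}},
]

counts = attachment_type_counts(events)
total = sum(counts.values())
```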

Links:

  • Total Number
  • Most common domain
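For the link stats, something like the following could work, assuming the link URLs have already been collected from the LINK segments. `domain_counts` is a made-up helper name.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_counts(links):
    """Count links per domain, ignoring strings without a network location."""
    domains = (urlparse(link).netloc.lower() for link in links)
    return Counter(d for d in domains if d)


links = [
    "https://github.com/mkaz/termgraph",
    "https://github.com/",
    "https://seaborn.pydata.org/examples/wide_data_lineplot.html",
]

counts = domain_counts(links)
print(sum(counts.values()), counts.most_common(1))
```

Note that `netloc` keeps subdomains, so `seaborn.pydata.org` and `pydata.org` would be counted separately.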

Misc:

  • Cross verticals (length of message by time of day)
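One way to cross two measures, sketched for average message length by hour. It assumes each message has already been reduced to an `(hour, message)` pair; the function name and sample rows are illustrative.

```python
from collections import defaultdict


def mean_length_by_hour(rows):
    """Average message length (in characters) for each hour of the day."""
    totals = defaultdict(lambda: [0, 0])  # hour -> [total chars, message count]
    for hour, message in rows:
        totals[hour][0] += len(message)
        totals[hour][1] += 1
    return {hour: chars / n for hour, (chars, n) in sorted(totals.items())}


rows = [(22, "hello"), (22, "hi!"), (9, "morning")]
averages = mean_length_by_hour(rows)
```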

Graphing:

  • https://github.com/mkaz/termgraph
  • https://seaborn.pydata.org/examples/wide_data_lineplot.html

Notes:

sender | year | month | day | hour | timestamp | message
-------|------|-------|-----|------|-----------|----------------------
kerry  | 2019 | 08    | 15  | 22   | 232432424 | message content here!
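A possible shape for building one such row, assuming a microsecond timestamp and UTC for the date parts (a local time zone could be swapped in if time-of-day stats should match where the chat happened). `to_row` and the sample values are hypothetical.

```python
from datetime import datetime, timezone

COLUMNS = ["sender", "year", "month", "day", "hour", "timestamp", "message"]


def to_row(sender, timestamp_us, message):
    """Flatten one message into the sender/date-parts/message layout."""
    when = datetime.fromtimestamp(int(timestamp_us) // 1_000_000, tz=timezone.utc)
    return {
        "sender": sender,
        "year": f"{when.year:04d}",
        "month": f"{when.month:02d}",  # zero-padded, e.g. "08"
        "day": f"{when.day:02d}",
        "hour": f"{when.hour:02d}",
        "timestamp": int(timestamp_us),
        "message": message,
    }


row = to_row("kerry", "1372715320276317", "message content here!")
print(" | ".join(str(row[col]) for col in COLUMNS))
```

Keeping the date parts as separate columns makes the by-year, by-month, and by-hour stats simple group-and-count operations.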