I’ve chosen to post today on a topic that is becoming more and more important for many people and businesses: when should one post on social networks, and what posting schedule best optimizes responses, feedback or clicks?
I found this topic interesting for a couple of reasons. The issue is sensitive for me because I spend perhaps an excessive amount of time posting on social networks. It is probably even more sensitive for businesses of any size, for the various institutions that use these networks, and of course for the social media companies themselves, which gather large amounts of data from these feeds around the clock, every day of the year. That requires of them an ever greater capacity to deal with this huge volume of data and information.
The authors of this paper focus their research on message feeds from Facebook and Twitter. They have made an important effort to understand the complexity of the problem at hand by examining user behavior in terms of post-to-reaction times, and by comparing weekly reaction behavior across networks and across cities:
For many users on social networks, one of the goals when broadcasting content is to reach a large audience. The probability of receiving reactions to a message differs for each user and depends on various factors, such as location, daily and weekly behavior patterns and the visibility of the message. While previous work has focused on overall network dynamics and message flow cascades, the problem of recommending personalized posting times has remained an underexplored topic of research. In this study, we formulate a when-to-post problem, where the objective is to find the best times for a user to post on social networks in order to maximize the probability of audience responses. To understand the complexity of the problem, we examine user behavior in terms of post-to-reaction times, and compare cross-network and cross-city weekly reaction behavior for users in different cities, on both Twitter and Facebook. We perform this analysis on over a billion posted messages and observed reactions, and propose multiple approaches for generating personalized posting schedules. We empirically assess these schedules on a sampled user set of 0.5 million active users and more than 25 million messages observed over a 56 day period. We show that users see a reaction gain of up to 17% on Facebook and 4% on Twitter when the recommended posting times are used. We open the dataset used in this study, which includes timestamps for over 144 million posts and over 1.1 billion reactions. The personalized schedules derived here are used in a fully deployed production system to recommend posting times for millions of users every day.
This long abstract describes the work in some detail, but a closer inspection is in order due to the sensitive nature of the subject. Analyzing social media is becoming ever more important in our time of Big Data analytics and of machine learning techniques, such as deep learning, that analyze text and provide better insights into the data at hand. A better understanding of the timing patterns of posts can only be a helpful addition to this tool set, providing another way to model user behavior on social networks.
We may consider this work helpful also to businesses and processes where the attention of the user is a valuable asset. The proper segmentation of the user base is therefore one of the goals of studies such as this. Another goal concerns how to maximize the influence of the messages (posts) sent through the social network:
One of the goals while broadcasting messages is to capture the attention of audience members so that they may react to the posted message. The probability that an audience member reacts to a message may depend on several factors, such as his daily and weekly behavior patterns, his location or timezone, and the volume of other messages competing for his attention. The problem of broadcasting messages at the right time in order to elicit responses from one’s audience is therefore a complex one with many dimensions.
A large body of research in this area has focused on the problem of influence maximization and related topics, where the goal is to target a specific subset of users in order to create information cascades in the network. However, the dynamics of broadcasting to entire audiences, rather than picking specific individuals to target, has been an under-explored topic of study. Further, since each user has a unique audience, any recommendations for posting times need to be personalized to be effective, as we show in this study. We hence formulate a when-to-post problem here, where the objective is to find the best times for a user to post on social networks in order to increase audience responses.
This research is also important for the novel approach it proposes, something we are becoming used to when we read this blog, to understanding the nature of the problem. It is not difficult to imagine a varied set of agents keen on its resolution:
Apart from introducing the problem, our contributions in this work are three-fold. First, in order to understand the complexity of the when-to-post problem and the factors that affect it, we perform in-depth user reaction behavior analysis, which includes
- Post-to-reaction behavior: We analyze the delays between posting and reaction times across different social networks and user in-degrees.
- Cross-network analysis: We examine the similarities and differences of audience behavior on Twitter and Facebook.
- Cross-city analysis: We compare cycles of daily and weekly user activity in different cities, and present analysis on how location affects posting schedules.
Second, we formally define the when-to-post problem in a probabilistic setting, and propose multiple approaches for recommending personalized posting schedules. Among these are the First-Degree and the Second-Degree schedules, and their corresponding weighted counterparts. We empirically assess these schedules against two global baselines, on a real-world set of 0.5 million active users observed over a 56 day period. We define a metric called Reaction Gain that helps us evaluate the effectiveness of the two approaches, and show that users see an average reaction gain of up to 17% for Facebook and up to 4% for Twitter.
The model and system developed in this research generate recommendations for scheduled posts on Facebook and Twitter at a scale on the order of millions of events:
Third, we open a public dataset consisting of anonymized user ids and timestamp data that could help future research in this area. This dataset contains timestamps for 144 million posts and 1.1 billion reactions from a 120-day period. We performed our study and analysis on a full production system deployed on klout.com. Klout is a social media platform that aggregates and analyzes data from social networks such as Twitter, Facebook, Google+ and others. Our system recommends personalized posting schedules for millions of users to share content on Twitter and Facebook.
Related Work and Problem Statement
The model in this paper is also interesting because the authors claim that it goes a step beyond modeling the overall temporal characteristics of social media events: the perspective of the individual user is part of the input, which enables personalized recommendations for posting messages. The mathematical sophistication of this line of research is also notable, with related work using convolution functions and the topological structure of the network to analyze information flow. The approach here limits itself to a more personal, per-user temporal and spatial analysis:
There have been several studies on modeling the dynamics of social network events [12, 15]. For example, the work in  used different convolution functions to analyze the flow of news events and sentiments through Twitter. While the approach of these studies has been to analyze the overall temporal characteristics on social media, here we take the further step of analyzing reaction behavior from the point of view of each individual user, thereby enabling personalized recommendations for posting messages.
Another line of related research is in the area of information flow and diffusion. Studies such as [11, 13, 5] have analyzed how factors such as the topological structure of social networks play a role in information cascades. Yang et al.  presented results on analyzing message flow based on Twitter mentions, and found that long-term historical user properties such as the rate of previous mentions were as important as the tweet content. The authors in  studied the importance of hashtag adoption in determining the popularity and spread of tweets. The study in  proposed a predictive approach to model dynamics of diffusion in social networks based on social, semantic and temporal dimensions. However, the problem of examining the flow of messages in the entire network differs significantly from the one in our study. Here we are instead concerned with the reactions received by a single user in a short time window.
Sometimes the post behavior is used in the context of one-on-one or personal communication, while other times it may be geared towards a larger audience. Here we focus on the latter case, where one of the motivations behind posting is to reach a large audience and to capture their attention. In particular, we examine the time-related aspects of this behavior and frame a when-to-post problem as follows:
For a user on a social network, find the best time to post a message within a specified time period in order to maximize the probability of receiving audience reactions.
Note that we only consider first-degree reactions such as replies and retweets on Twitter and comments on Facebook, and not those caused by an audience member resharing the original post. In other words, we focus mainly on the reactions a post receives by the user’s immediate audience, and not on how the post propagates through the network.
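The problem statement above can be made concrete with a small sketch. This is not the authors' First-Degree or Second-Degree schedule; it is a simplified frequency-based stand-in that assumes we already have a list of timestamps of first-degree audience reactions, buckets them by hour of the week, and ranks the buckets. The function names and the 168-bucket layout are hypothetical choices made for illustration, and the sketch ignores the delay between a post and its reactions.

```python
from collections import Counter
from datetime import datetime

HOURS_PER_WEEK = 7 * 24  # assumption: one bucket per hour of the week


def hour_of_week(ts: datetime) -> int:
    """Map a timestamp to a bucket in 0..167 (Monday 00:00 is bucket 0)."""
    return ts.weekday() * 24 + ts.hour


def reaction_profile(reaction_times):
    """Empirical share of audience reactions falling in each hour-of-week bucket."""
    counts = Counter(hour_of_week(ts) for ts in reaction_times)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * HOURS_PER_WEEK
    return [counts.get(h, 0) / total for h in range(HOURS_PER_WEEK)]


def best_posting_hours(reaction_times, top_k=3):
    """Rank buckets by reaction share, breaking ties by the earlier bucket."""
    profile = reaction_profile(reaction_times)
    ranked = sorted(range(HOURS_PER_WEEK), key=lambda h: (-profile[h], h))
    return ranked[:top_k]


if __name__ == "__main__":
    # Hypothetical reactions: three on a Monday morning, one on a Tuesday evening.
    reactions = [
        datetime(2024, 1, 1, 9, 5),
        datetime(2024, 1, 1, 9, 40),
        datetime(2024, 1, 1, 9, 55),
        datetime(2024, 1, 2, 18, 0),
    ]
    print(best_posting_hours(reactions, top_k=2))
```

A personalized schedule in the spirit of the paper would build such a profile per user from that user's own audience, rather than from one global pool of reactions.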
System Overview and Conclusions
The authors build their system on several established Big Data platforms: an Apache Hadoop cluster for distributed storage and analytics, Hive to process, query and manage the datasets, and Java utilities wrapped as Hive user-defined functions. Their system overview follows:
We collect user posts from Facebook through the oauth-token provided by registered users on Klout. We also use the oauth-token-based approach to collect the friend graph of users on Facebook and the follower graph for users on Twitter. Klout partners with GNIP to collect public data generated in the Twitter Mention Stream. For location analysis, we use the city, state and country information provided by registered users on the Klout application. The collected data is written out to a Hadoop cluster that uses HDFS as the file system, HBase as the serving datastore, and Hive to process, query and manage the large datasets. We implement independent Java utilities with Hive UDF (User Defined Function) wrappers, with functions to process user locations and timezones, and operators such as discrete convolution to process time-series vectors.
The combination of Hive Query Language and UDFs allows us to build map-reduce jobs that can scale up to analyze billions of messages posted to social platforms every day.
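The quoted passage mentions discrete convolution as one of the operators applied to time-series vectors. The authors implement theirs as Java utilities behind Hive UDFs; the following is only a plain Python sketch of the operation itself, assuming hourly buckets. The input vectors (`posts_per_hour`, `delay_filter`) are hypothetical values chosen for illustration, not data from the paper.

```python
def discrete_convolution(signal, kernel):
    """Full linear convolution: out[n] = sum over k of signal[k] * kernel[n - k]."""
    if not signal or not kernel:
        return []
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


if __name__ == "__main__":
    # Hypothetical inputs: posts made in three consecutive hours, and the share
    # of reactions that arrive 0, 1 and 2 hours after a post.
    posts_per_hour = [2, 0, 1]
    delay_filter = [0.5, 0.3, 0.2]
    # Expected reaction volume per hour under these assumptions.
    print(discrete_convolution(posts_per_hour, delay_filter))
```

Read this way, convolving a posting-time vector with a post-to-reaction delay filter spreads each post's expected reactions over the hours that follow it.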
The authors proceed with a series of definitions and notations (reaction times, the post-to-reaction filter, the correlation between posts’ reaction times within the user social graph and within cities, cross-network cosine similarity, cross-city analysis, etc.), followed by the derivation and evaluation of the recommended schedules, all presented with detailed tables and graphs of the empirical findings. I recommend reading the whole paper to anyone interested.
For now, the main conclusions are highlighted here:
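As an illustration of one of the measures named above, cosine similarity between two reaction profiles can be sketched as follows. This is the standard textbook formula, not code from the paper, and the two short profiles in the usage example are hypothetical values.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of buckets")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical reaction shares over four time buckets on two networks.
    twitter_profile = [0.1, 0.4, 0.4, 0.1]
    facebook_profile = [0.2, 0.3, 0.3, 0.2]
    print(cosine_similarity(twitter_profile, facebook_profile))
```

A value close to 1 would indicate that the two audiences react at similar times of the week, while a lower value points to different rhythms.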
In this study, we introduce and formulate a when-to-post problem to find the best times to post on social networks in order to increase the number of received reactions.
We analyze various factors that affect audience reactions on a dataset containing over a billion reactions on hundreds of millions of messages. We find that a majority of reactions occur within the first 2 hours of posting times on most networks. Audience behavior differs significantly on different networks, with Twitter having larger reaction volumes in shorter time windows as compared to Facebook. We also perform location analysis and find interesting similarities and differences between cities in terms of reaction patterns. Future studies could also study other factors such as content and topical relevance of posted messages.
Further, we present multiple approaches for deriving personalized posting schedules for users, and compare them to two baselines. We evaluate these schedules on empirical data from 0.5 million real-world users and 25 million messages observed over a 56-day period. We find that the First-Degree Weighted Schedule performs the best among all, providing a reaction gain of 17% on Facebook and 4% on Twitter. Both first-degree schedules perform better on Facebook and both weighted schedules perform better on Twitter. These schedules are deployed on a full production system that recommends posting times to millions of users daily.
We hope that this study and the accompanying dataset provided enables further research in this area.
Inserted Image: Social Media Research: Media Studies & Communication