Simple random sampling: Not so simple

This was originally posted on the BIDS blog, here.

Simple random sampling is drawing k objects from a group of n in such a way that all possible subsets of size k are equally likely. In practice, it is difficult to draw truly random samples. Instead, people tend to draw samples using

  1. A pseudorandom number generator (PRNG) that produces sequences of bits, plus
  2. A sampling algorithm that maps a sequence of pseudorandom numbers into a subset of the population.

Most people take for granted that this procedure is a sufficient approximation to simple random sampling. If it isn’t, then many statistical results may be called into question: anything that relies on sampling, including permutations, bootstrapping, and Monte Carlo simulations, may give biased results.
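To make the two-step procedure concrete, here is a minimal sketch in Python. It assumes the standard library's `random.Random` (a Mersenne Twister in CPython) as the PRNG and a partial Fisher-Yates shuffle as the sampling algorithm; the function name `draw_sample` is made up for illustration, and other sampling algorithms would work just as well.

```python
import random


def draw_sample(population, k, seed):
    """Draw k items from population using a seeded PRNG (illustrative sketch)."""
    prng = random.Random(seed)       # step 1: the PRNG, set up from a seed
    items = list(population)
    n = len(items)
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and the population size")
    for i in range(k):               # step 2: the sampling algorithm
        j = prng.randrange(i, n)     # pseudorandom index in [i, n)
        items[i], items[j] = items[j], items[i]
    return items[:k]


sample = draw_sample(range(100), k=5, seed=2017)
```

Because both steps are deterministic, the same seed always yields the same sample, which is exactly why the size of the PRNG's state space matters.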

This blog post is a preview of what I plan to talk about at the UC Berkeley Risk Management Seminar on Tuesday, February 7. This is joint work with Philip B. Stark and Ronald Rivest.

 

Finite state space

A PRNG is a deterministic function with several components:

  • A user-supplied seed value used to set the internal state
  • A function that maps the internal state to pseudorandom bits
  • A function that updates the internal state

[Figure: image1.png]

The internal state of a PRNG is usually stored as an integer or matrix with fixed size. As such, it can only take on finitely many values. PRNGs are periodic: if we generate enough pseudorandom numbers, we will update the internal state so many times that the PRNG will return to its starting state.
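The three components and the resulting periodicity can be seen in a toy example. The sketch below is a deliberately tiny linear congruential generator with a 4-bit state; the class `ToyPRNG`, its constants, and the helper `period` are invented for illustration and are not meant for real use.

```python
class ToyPRNG:
    """A toy PRNG with a 4-bit internal state (16 possible values)."""

    MODULUS = 16

    def __init__(self, seed):
        # The user-supplied seed sets the internal state.
        self.state = seed % self.MODULUS

    def _update(self):
        # Function that updates the internal state.
        self.state = (5 * self.state + 3) % self.MODULUS

    def next_bit(self):
        # Function that maps the internal state to a pseudorandom bit
        # (here, the top bit of the 4-bit state).
        self._update()
        return self.state >> 3


def period(seed):
    """Count the updates needed for the state to return to its start."""
    prng = ToyPRNG(seed)
    start = prng.state
    steps = 0
    while True:
        prng._update()
        steps += 1
        if prng.state == start:
            return steps
```

With only 16 possible states, `period(seed)` can never exceed 16, whatever the seed. Real PRNGs have vastly larger state spaces, but the same argument applies.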

This periodicity is a problem. PRNGs are deterministic, so for each value of the internal state, our sampling algorithm of choice will give us exactly one random sample. If the number of samples of size k from a population of n is greater than the size of the PRNG’s state space, then the PRNG cannot possibly generate all samples.

This will certainly be a problem for most PRNGs when n and k grow large, even for those like the Mersenne Twister, which is widely accepted and used as the default PRNG in most common software packages.
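A quick back-of-the-envelope check shows how fast this happens. The sketch below compares the number of possible samples, n choose k, with the number of distinct states of a PRNG whose state has a given number of bits; the helper name `can_reach_all_samples` is made up for this illustration. The Mersenne Twister's state is 19937 bits.

```python
from math import comb


def can_reach_all_samples(n, k, state_bits):
    """True if the number of size-k samples fits within 2**state_bits states.

    This is only a necessary condition: a PRNG with fewer states than
    samples cannot possibly generate every sample.
    """
    return comb(n, k) <= 2 ** state_bits


# A PRNG with a 32-bit state cannot produce every sample of 10 from 50.
print(can_reach_all_samples(50, 10, state_bits=32))           # False

# The Mersenne Twister's 19937-bit state runs out as n and k grow.
print(can_reach_all_samples(10_000, 5_000, state_bits=19937))
print(can_reach_all_samples(30_000, 15_000, state_bits=19937))
```

Passing this check does not guarantee that every sample is actually reachable, or that samples are equally likely. Failing it guarantees that some samples can never be drawn.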

 

Cryptographically secure PRNGs

One solution is to use PRNGs that have an infinite state space. Cryptographers have worked extensively on this problem, but cryptographically secure PRNGs haven’t yet become mainstream in other fields. They’re a bit slower than the PRNGs in wide use, so they’re typically reserved for applications where security is important. For the purpose of sampling, the bulk of the computational time will be spent in the sampling algorithm and not in the PRNG, so we are less concerned.

Hash functions take in a message of arbitrary length and return a hashed value of fixed length (e.g. 256 bits). A cryptographic hash function has the additional properties that it is computationally infeasible to invert in polynomial time; it’s difficult to find two inputs that hash to the same output; small changes to the input message produce large, unpredictable changes to the output; and the output bits are uniformly distributed. These properties are amenable to generating pseudorandom numbers. The diagram gives a cartoon of how a hash function operates on a message x to output a hashed value h(x).

[Figure: image2.png]

We are developing plug-in PRNGs based on the SHA256 hash function for R and Python. The Python package is in development on GitHub. The code is currently only prototyped in Python, but watch our repository for a sped-up C implementation.
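To give a flavor of how a hash function can drive a PRNG, here is a minimal sketch that hashes a seed together with a counter using Python's `hashlib.sha256`. This is not the code from our package: the class `HashPRNG`, its methods, and the seed-plus-counter message format are assumptions chosen to keep the example short.

```python
import hashlib


class HashPRNG:
    """Illustrative SHA-256 PRNG: hash the seed together with a counter."""

    def __init__(self, seed):
        self.seed = str(seed).encode("utf-8")
        self.counter = 0  # Python integers are unbounded, so this never wraps

    def next_int(self):
        """Return the next 256-bit pseudorandom integer."""
        message = self.seed + b"," + str(self.counter).encode("utf-8")
        self.counter += 1
        return int.from_bytes(hashlib.sha256(message).digest(), "big")

    def random(self):
        """Return a float in [0, 1) built from the top 53 bits of the hash."""
        return (self.next_int() >> (256 - 53)) / 2 ** 53


prng = HashPRNG(seed=12345)
values = [prng.random() for _ in range(3)]
```

The state here is the seed plus a counter that can grow without bound, rather than a fixed-size integer or matrix, which is the property that matters for sampling.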

 

Tests for pseudorandomness

Generating pseudorandom numbers with traditional PRNGs is a problem when n and k grow large, but how do they perform for small or moderate n and k? I would argue that if we’re using PRNGs for statistical methods, we should judge their performance by how well they can generate simple random samples. We are actively working on testing PRNGs for this goal and hope to have a paper out later this year. Stay tuned!
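As a rough illustration of what judging a PRNG by its samples could look like, the sketch below draws many small samples with Python's built-in `random.sample` and computes a chi-square statistic comparing how often each subset appears with the equal frequencies expected under simple random sampling. This is only one simple check, not the tests we are developing, and the function name `sample_frequency_chi2` is hypothetical.

```python
import random
from collections import Counter
from math import comb


def sample_frequency_chi2(n, k, trials, seed):
    """Chi-square statistic for how evenly size-k subsets of range(n) occur."""
    prng = random.Random(seed)
    counts = Counter(
        tuple(sorted(prng.sample(range(n), k))) for _ in range(trials)
    )
    n_subsets = comb(n, k)
    expected = trials / n_subsets
    chi2 = sum((c - expected) ** 2 / expected for c in counts.values())
    # Subsets that were never drawn have count 0 and contribute `expected` each.
    chi2 += (n_subsets - len(counts)) * expected
    return chi2, counts


chi2, counts = sample_frequency_chi2(n=5, k=2, trials=10_000, seed=1)
```

Under simple random sampling, the statistic is approximately chi-square distributed with `comb(n, k) - 1` degrees of freedom (9 in this example), so very large values would be evidence against uniformity. This approach only works when the number of subsets is small enough to enumerate.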

Gender Issues Roundtable Discussion: A Case Study in Uncomfortable Conversations

This post was originally published on the BIDS blog, here, and was written with Rebecca Barter, Ryan Giordano, and Sara Stoudt.

In 2015, UC Berkeley experienced a series of high-profile sexual harassment incidents, prompting the graduate students in the Statistics Department to hold a roundtable event. While this response was initiated by overt sexual harassment, our discussion extended to other subtler gender biases. This article outlines the events that led to the roundtable’s inception, the details of organizing and hosting the event, and our thoughts on what did and didn’t work.

The Situation

The most high-profile incident involved Astronomy Professor Geoffrey Marcy, who was found in violation of campus policies after several women came forward accusing him of sexual harassment over the course of his tenure at UC Berkeley. To add insult to injury, some claimed that the university, which had uncovered details of the situation long before it became public, mishandled the case by simply giving Marcy a metaphorical slap on the wrist. It was not until faculty in his department circulated an open letter of disapproval that Marcy finally stepped down from his position.

It was becoming clear that sexual harassment was pervasive at all levels of the academic ladder and that such incidents were not being properly addressed or punished. While the campus scrambled to respond to the Marcy incident, several more stories received media attention: to name a few, an executive assistant at Berkeley Law accused the dean of the law school of inappropriately touching her and making her feel uncomfortable, for which he was found guilty; several undergraduate women came forward, stating that they sought help from various resources on campus after being sexually assaulted, only to be dismissed and not taken seriously by the officials charged with protecting them; and the Vice Chancellor of Research resigned amid an investigation of alleged sexual harassment of a former employee. In fact, there have been reports of a whopping 19 campus employees violating sexual harassment policies since 2009.

As the scope of these offenses became public, the campus responded by creating new leadership roles to combat sexual harassment. However, many of these actions appeared only to signal that the campus was aware of the problem rather than directly providing support to students, faculty, and staff on campus.

Initiating the Roundtable Event

To take tangible steps, the chair of the Statistics Department, Michael Jordan, asked the Statistics Graduate Student Association (SGSA) to host a departmental event about these issues. The idea was to have a meeting for the students that was run by the students rather than a mandatory meeting held by an administrator.

Professor Jordan’s suggestion was to center the meeting around “case studies” of gender bias and sexual harassment. Typical sexual harassment trainings exclusively feature case studies of behavior so obviously inappropriate that we felt such scenarios offer little to discuss. Instead, we decided to focus the case studies on more common, yet subtler, experiences of implicit bias and gender-based discrimination. Furthermore, we thought that it would be more impactful if the scenarios we discussed were not fictional but were real stories from our colleagues. Most, if not all, women we know have experienced some form of gender bias or sexual harassment. We subsequently created a form for our colleagues to anonymously submit an anecdote describing a situation that they or someone they know experienced. With a collection of anonymous stories, we could discuss case studies as is done in a typical sexual harassment training but could bring the discussion closer to home.

We reached out to several groups on campus for guidance on how to run the event in a way that would not only be useful and informative but would also encourage participation and open discussion of the issues. Maria Lucero-Padilla from the Office for Prevention of Harassment and Discrimination (OPHD) generously shared many materials and guidelines with us in preparation for our discussion. She helped us identify themes among the case studies we collected and give structure to the conversation. We would like to thank her for her help.

Our first gender biases roundtable discussion in 2015 was met with overwhelmingly positive responses, and we decided to repeat the event in 2016 with new case studies and new themes. However, we strongly believe that the conversation about gender issues is not over. By writing about our experience running this event, we hope to teach others holding similar events the lessons we learned and to get feedback on how to expand this program to encompass other types of diversity beyond gender.

Planning and Running the Roundtable Event

The planning committee

In order to emphasize that participation in such a discussion is important for both men and women, we decided to enlist members of both genders to help plan and run the meeting. A meeting run only by men runs the risk of being construed as condescending or “mansplaining,” whereas a meeting run only by women might make male attendees feel uncomfortable and attacked. We recognize that not all individuals identify with this gender binary; however, all participants from our department identified with one of these two genders, so we chose to draw this distinction when reflecting on participants’ roles and experiences.

Collecting anecdotes

The first year, we circulated a call for submissions and a form to submit anonymous anecdotes to women in the Statistics Department and a limited number of people involved with diversity issues at the Berkeley Institute for Data Science (BIDS). The second year, we circulated the form more widely to the entire Statistics Department and to all affiliates at BIDS. Both times, we received fewer than 10 submissions of stories.

Although they were submitted anonymously, it was apparent from the context of the stories that the responders were either postdoctoral scholars or faculty rather than PhD or MA students. There are a number of possible reasons for this: these more senior individuals have been in academia longer (especially during times when the workforce was a much more difficult place for women than it is today), have participated in a wider breadth of activities, have progressed through more stages of life, and have thus had more time to accumulate experiences.

It is also possible that female graduate students in our department chose not to submit anecdotes because they were anxious about sharing them, either because they felt they would be judged or because the stories involved men in the department whom participants might recognize. It is also possible that they really haven’t encountered any biases (or simply did not recognize such biases when faced with them), in which case statistics is a very special field.

Choosing anecdotes

Several common themes emerged in the gender bias anecdotes we received. Some stories observed competitions in which the woman was subtly undermined (e.g., when hiring new faculty). Several women described repeatedly being seen as the “token woman” (e.g., being asked about babies instead of about research or being the first woman ever hired in a department). Several anecdotes about sexual harassment involved status and power dynamics (e.g., students writing inappropriate comments to their GSI in evaluations and tensions between colleagues in the same lab).

We chose several anecdotes to discuss. In the first year, two were about implicit biases related to hiring decisions and one was about sexual harassment in a teaching role. In the second year, we used four anecdotes covering implicit bias, sexual harassment by a colleague, sexual harassment by a student, and the issue of getting senior leaders to value diversity.

During the event

The meetings lasted for about two hours and were held in the evening (from 5:00 to 7:00 p.m.) in order to minimize conflicts with class times. Professor Jordan kindly funded dinner in order to encourage attendance. Both years, we had around 16 attendees. The group was about three-quarters male, which reflects the gender ratio in the department.

We began the meeting by explaining ground rules for the discussion (1. Focus on the statement, not the speaker; 2. Assume the best intentions of the speaker; 3. Listen and try not to feel personally attacked) and a short listening exercise in pairs, which set the tone of the discussion nicely. We found that because we set expectations for the discussion, people were more receptive to discussing uncomfortable topics and were ready to respectfully listen to each other.

Next, we talked about the anecdotes. During this portion of the meeting, we presented one case study at a time and asked the participants to form small groups of three to four in order to discuss the story for five minutes, providing them some guiding themes or questions. Afterwards, we invited volunteers from the groups to tell the whole room something interesting that they talked about.

Finally, both years, we concluded by asking the group what next steps the SGSA should take to grow its diversity and inclusion efforts in the future. After the first year, the question was whether people wanted to repeat such meetings in the future. The response was positive. After the second year, we asked about how we might expand the scope of such meetings to include other aspects of diversity beyond gender and how to include the entire department in larger discussions.

Lessons Learned

Below, we outline some of the main factors that we believe contributed to the success of the roundtable events.

Small group discussions are a good way to encourage participation

Based on our observations, it looked like everyone in the small groups spoke to each other. When we brought the discussion back to the whole table, only a few outgoing people volunteered ideas. Forming small groups allows even the quietest people to participate because they don’t feel like the spotlight is on them.

There may be some need to informally “spread out” the women participants to various small groups so that each small group benefits from perspectives from both genders. Although we recognize that it can be difficult to be the lone representative of women in a group, the gender ratio of participants necessitated this.

Individuals enjoy and benefit from this event regardless of their gender

We thought hard about how to keep the discussion from being a rote affirmation of what we all already know are the social norms around sexism. There is a danger in these types of events that everyone goes through the motions that we know are expected of us without really gaining any new perspective on gender issues.

We wanted the roundtable to be the kind of discussion that would allow people to express contradicting opinions and come in contact with genuinely different points of view. To facilitate this, we chose anecdotes that were relatively ambiguous incidents that might seem superficially minor but become problematic when viewed in the context of what women deal with in a sexist society. This both allowed participants to freely advocate for opinions that would not have been politically correct had the incidents been more serious and enabled women to share the many reasons why a seemingly innocuous incident could be hurtful or dangerous.

Women often talk about such issues with each other but rarely with men. It is satisfying to have this conversation openly and have the opposite sex hear about the issues that women face. The case study submission format encourages this: everyone knows that these are real stories that have happened to people we probably know and interact with.

On the other hand, some feedback that we got from men who attended was that they liked having a venue to discuss gender issues candidly because they rarely have the opportunity. It is something they care about, but they do not know how to bring up the subject themselves.

The event works well when the participants know each other

In this regard, the timing of the event and group dynamics worked in our favor. The Statistics Department has around 50 PhD students, with eight to 12 per incoming cohort. Students take many of the same classes together during their first year, so they get to know each other well after several months. We held this event at the end of November. By this time, most of the new PhD students had met each other and were comfortable together.

This level of familiarity lowered the barrier to having an open discussion about uncomfortable issues. Contrast this with the typical mandatory sexual harassment training, which often occurs when someone first joins a new program.

In a room full of strangers, we imagine that it would be much more difficult to candidly discuss subtle issues about gender bias. Thus, it is important to invite an intimate group of people to participate in such a roundtable discussion. Note, however, the converse: with a different group of people, an identical event might not go well.

Participation is greatly enhanced by encouragement from senior faculty members

The downside of having the discussion later on in the semester is that it often ends up being around “crunch time” for students. Email encouragements from both senior graduate students and the Department Chair as well as in-person discussion of the upcoming event at informal department gatherings helped boost the number of people who RSVP’d “yes.” In both years, a good mix of class years attended.

Getting volunteers to help lead future events is trickier. Participants agreed that future discussions should be extended to include broader themes of diversity, such as race, sexuality, and gender identity, but so far, nobody has volunteered to help organize a broader diversity discussion. The tension here is that the students currently leading the gender discussion (three Caucasian females and a Caucasian male) feel that they cannot speak to other aspects of diversity, yet actively targeting others to take on that role can be problematic.

Perhaps opening up the discussion to joint venues with representation from other departments would help increase the pool of those willing to take on facilitator roles at the expense of having a group that is less familiar with one another.

Next Steps

We plan to continue holding this type of discussion at least once a year and expand the content to include all aspects of diversity. We hope to recruit additional leaders to help represent a broader base of diversity.

Any suggestions for how to move forward with these discussions or expand this program to other venues would be welcome.

Propensity score matching in Python, revisited

Update 8/11/2017: I’ve been working on turning this code into a package people can download and contribute to. Please use the package, linked here, instead of the code I shared in a Jupyter notebook previously.

I can’t believe how many people from all around the world visit my previous blog post on propensity score matching in Python every day. It feels great to know that my code is out there and people are actually using it. However, I realized that the notebook I linked to previously doesn’t contain much and that I wrote heaps more code after posting it. Hence, I’m sharing a more complete notebook with code for different variations on propensity score matching, functions to compute average treatment effects and get standard errors, and checks for balance between matched groups.