
LegEx Data / Legislative Explorer Codebook

LegEx is driven by an underlying MySQL database with three primary tables containing information about members, bills, and bill actions. This codebook describes the data available for download and the output of the Python scripts included in this repository. Report any issues with existing data to stramp@uw.edu.

Data Linkages

Each of these tables is available to download separately as a text file, and can be linked together based on the relationships outlined in the table below:

Members   Bills        Actions
thomas    SpThomasID   -
cong      Cong         -
-         idNew        billID

In narrative form: each member has a unique thomas ID number, and the members table holds one record per member per congress (cong). The bills table identifies a bill's sponsor by SpThomasID (the same value as thomas in members) together with Cong. The idNew field in bills links to the billID variable in actions.
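The linkage described above can be sketched in Python. This is a minimal illustration, not the repository's own code: the sample rows are invented, the tables are assumed to be already loaded as lists of dictionaries keyed by column name, and the bills key is written idNEW as in the Bills Table section (check the capitalization against the header of the downloaded file).

```python
# Hypothetical sample rows; real rows come from the downloaded text files.
members = [
    {"thomas": "1", "cong": 112, "last": "Smith"},
    {"thomas": "1", "cong": 113, "last": "Smith"},
    {"thomas": "2", "cong": 113, "last": "Jones"},
]
bills = [
    {"idNEW": 10, "SpThomasID": "1", "Cong": 113, "bill": 5},
    {"idNEW": 11, "SpThomasID": "2", "Cong": 113, "bill": 6},
]
actions = [
    {"actionID": 100, "billID": 10, "acted_at": "2013-01-03"},
    {"actionID": 101, "billID": 10, "acted_at": "2013-02-01"},
    {"actionID": 102, "billID": 11, "acted_at": "2013-01-04"},
]

# One member record exists per (thomas, cong), so that pair works as a key.
member_by_key = {(m["thomas"], m["cong"]): m for m in members}

# Group actions by the bill they belong to (actions.billID -> bills.idNEW).
actions_by_bill = {}
for a in actions:
    actions_by_bill.setdefault(a["billID"], []).append(a)

def link_bill(bill):
    """Return the bill together with its sponsor record and its actions."""
    sponsor = member_by_key.get((bill["SpThomasID"], bill["Cong"]))
    return {
        "bill": bill,
        "sponsor": sponsor,
        "actions": actions_by_bill.get(bill["idNEW"], []),
    }

linked = [link_bill(b) for b in bills]
```

Matching on thomas alone would pair a bill with every term the sponsor served, which is why the congress is part of the key.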

Common Variables:

These variables are common across all three tables:

  • bill: bill number
  • cong or Cong: congress of bill
  • billtype or BillType: type of bill or resolution
    • Bills (binding, signed by President)
      • hr – House Bill
      • s – Senate Bill
    • Resolutions (non-binding, not signed)
      • hres – House Simple Resolutions
      • hconres – House Concurrent Resolutions
      • hjres – House Joint Resolutions
      • sres – Senate Simple Resolutions
      • sconres – Senate Concurrent Resolutions
      • sjres – Senate Joint Resolutions
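As a rough sketch, the billtype codes above can be mapped to a chamber and a bill/resolution flag. The function name and return shape are hypothetical. The result resembles the Senate and isBill variables in the bills table, but how those variables were actually computed is not stated in this codebook.

```python
# Codes grouped the same way as the list above.
BILL_TYPES = {"hr", "s"}
RESOLUTION_TYPES = {"hres", "hconres", "hjres", "sres", "sconres", "sjres"}

def describe_billtype(billtype):
    """Return (chamber, is_bill) for a billtype code from this codebook."""
    code = billtype.strip().lower()
    if code not in BILL_TYPES | RESOLUTION_TYPES:
        raise ValueError(f"unknown billtype: {billtype!r}")
    # Every Senate code starts with "s"; every House code starts with "h".
    chamber = "Senate" if code.startswith("s") else "House"
    return chamber, code in BILL_TYPES
```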

Members Table

This table contains one record for each Member of Congress for every term in which they served. All data come from the congress-legislators GitHub repository unless otherwise noted.

  • id: primary key
  • ICPSR: a commonly used identifier for each legislator, though not used as a link in this data.
  • thomas: the congress.gov identifier for each legislator. This is the primary identifier used in this database for a member.
  • govtrack: the GovTrack identifier for each member
  • first: member’s first name
  • last: member’s last name
  • gender: member’s gender (F, M)
  • type: rep (Representative) or sen (Senator)
  • state: state
  • start: date member took office in current term
  • end: date member left office in current term
  • district: numeric identifier for house districts, 0 for Senate
  • class: Senate class, indicating which election cycle the member is in. The years below refer to the current terms; earlier election years can be found by subtracting multiples of six years.
    • 1: Reelected in 2012
    • 2: Up for reelection in 2014
    • 3: Up for reelection in 2016
  • party: Party of legislator
  • simpleParty: Generic party identification (D,R or I)
  • DW1: numeric representation approximating ideology, the first dimension score from Poole and Rosenthal’s DW-Nominate.
  • DWisEst: [0,1] a 1 indicates that the listed DW1 value is an estimate based on party mean for the given Congress because the actual DW1 value was not available for this member.
  • ComC: [0,1] Was member the chair of any committee in a given congress?
  • ComR: [0,1] Was member the ranking minority member of any committee in a given congress?
  • LeadCham: [0,1] Did member lead the chamber during a given congress (House Speaker or Senate Majority Leader)?
  • sComC: [0,1] Was member the chair of any subcommittee in a given congress?
  • sComR: [0,1] Was member the ranking minority member of any subcommittee in a given congress?
  • manualCorrection: [0,1] Was any of this member’s data manually updated? Most manual updates were due to unusual names or missing ID numbers.
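The six-year arithmetic for the class variable can be sketched as below. The reference years are the ones listed under class. The function name is hypothetical, and the sketch covers regular election cycles only, ignoring special elections.

```python
# Reference election years taken from the class entries in this codebook.
CLASS_REFERENCE_YEAR = {1: 2012, 2: 2014, 3: 2016}

def election_years(senate_class, earliest, latest):
    """List regular election years for a Senate class, earliest..latest inclusive."""
    ref = CLASS_REFERENCE_YEAR[senate_class]
    # Shift the reference year onto the first cycle year at or after `earliest`.
    first = earliest + (ref - earliest) % 6
    return list(range(first, latest + 1, 6))
```

For example, election_years(2, 2001, 2014) steps back from 2014 in units of six years and yields 2002, 2008, and 2014.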

Bills Table

All bill data was scraped from thomas.loc.gov via the congress GitHub project, with exceptions noted below.

  • idNEW: primary key, also the link with actions as billID.
  • IntrDate: Date bill was introduced.
  • ShortTitle: preferred title to display, when NULL, use OfficialTitle
  • OfficialTitle: second choice for title (much longer)
  • PopTitle: not often used and not very descriptive, but provided by the underlying data.
  • SpThomasID: the Thomas (congress.gov) id for the bill’s sponsor, also the link to the members table in combination with cong.
  • SpName, SpState, SpDist: All characteristics of bill sponsor (also available in members table)
  • UpdatedAt: Provided by data source, indicates date the bill’s record was most recently updated.
  • CoSpThID: a comma delimited list of thomasIDs for cosponsors
  • MinorBill: a filter for bills considered “minor” in nature so that they may be excluded (post office bills, land transfers, etc).
  • compLaw: when not NULL, indicates a companion bill that this bill was “folded into”. This bill should be considered law at the same time as the identified companion bill. Scraped separately from Thomas for the 103rd-112th congresses.
  • Major: (only partially available for current Congress) Policy Agendas major topic code, from the Congressional Bills Project.
  • Minor: (only partially available for current Congress) Policy Agendas minor topic code, from the Congressional Bills Project.
  • ChRef: [0,1] Was sponsor chair of committee of referral?
  • RankRef: [0,1] Was sponsor ranking minority member of committee of referral?
  • MemRef: [0,1] Was sponsor a member of a committee of referral?
  • SubChRef: [0,1] Was sponsor a subcommittee chair for a committee of referral?
  • SubRankRef: [0,1] Was sponsor a subcommittee ranking minority member for a committee of referral?
  • Majority: [0,1] Was bill sponsored by a member of the majority?
  • Senate: [0,1] Is this bill/resolution from the Senate?
  • commRefs: A comma-delimited list of committees the bill was referred to, using congress.gov committee abbreviations described below.
  • URL: URL for official beta.congress.gov page for bill.
  • isBill: [0 = resolution,1 = bill] Is this a bill?
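Two of the fields above need a little handling before use: the title fallback (ShortTitle, then OfficialTitle) and the comma-delimited CoSpThID list. The following is a minimal sketch with hypothetical helper names. It assumes a bill row is a dictionary and that a NULL arrives as None, an empty string, or the literal text NULL, which depends on how the text file was exported.

```python
# Values treated as SQL NULL; this set is an assumption about the export format.
MISSING = (None, "", "NULL")

def display_title(bill):
    """Prefer ShortTitle; fall back to OfficialTitle when ShortTitle is NULL."""
    short = bill.get("ShortTitle")
    if short in MISSING:
        return bill.get("OfficialTitle")
    return short

def cosponsor_ids(bill):
    """Split the comma-delimited CoSpThID field into a list of thomas ids."""
    raw = bill.get("CoSpThID")
    if raw in MISSING:
        return []
    # Strip whitespace around each id and drop empty pieces.
    return [part.strip() for part in raw.split(",") if part.strip()]
```

Each id returned by cosponsor_ids can then be looked up in the members table together with the bill's Cong value, the same way the sponsor is.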

Note: Committee data comes from the following sources, and was linked to bill referral data in order to create the bill-level variables. A crosswalk (.csv) is available which provides various labeling schemes used for congressional committees by different organizations.

Actions Table

This is the same data that is listed in the “all actions” page for a bill at beta.congress.gov, but not every action is included. In particular, we did not capture subcommittee actions.

  • actionID: primary key for each bill action
  • billID: sequential bill identifier, tied to billsNEW
  • acted_at: date action occurred
  • loc: location of action (described in separate pdf) NOTE: Calendar actions are only available from the 97th Congress to present for the Senate and 101st to present for the House.
  • status: type of action (described in separate pdf)
  • actno: for diagnostic purposes, indicates sequential number of action in .json file sequence
  • subbill: this is an integer indicator that notes when a bill “splits off” into multiple sub-bills by being referred to multiple committees. It is only applied to actions at the committee level. The first committee continues in the 0 subbill, while additional referrals will lead to subbills 1, 2, 3, etc.
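To follow a bill through each of its referral paths, the actions for one bill can be grouped by subbill. This sketch uses a hypothetical function name and assumes that acted_at is a YYYY-MM-DD string, so plain string sorting is chronological, and that actions outside the committee level carry an empty subbill value (None here), since the variable is only applied to committee-level actions.

```python
from collections import defaultdict

def actions_by_subbill(actions):
    """Group one bill's actions by subbill, each group ordered by date then actno."""
    groups = defaultdict(list)
    for action in actions:
        groups[action["subbill"]].append(action)
    for group in groups.values():
        # ISO-style date strings sort chronologically; actno breaks same-day ties.
        group.sort(key=lambda a: (a["acted_at"], a["actno"]))
    return dict(groups)

# Hypothetical actions for a single bill referred to two committees.
sample = [
    {"actionID": 3, "subbill": 1, "acted_at": "2013-02-10", "actno": 3},
    {"actionID": 1, "subbill": 0, "acted_at": "2013-01-05", "actno": 1},
    {"actionID": 2, "subbill": 0, "acted_at": "2013-01-05", "actno": 2},
    {"actionID": 4, "subbill": None, "acted_at": "2013-03-01", "actno": 4},
]
grouped = actions_by_subbill(sample)
```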
