Help needed in linking two fields with different way of writing data

  • Thread starter Thread starter onur
  • Start date Start date
O

onur

Guest
I am currently trying to analyze the presidential campaign contributions made by employees of large US companies for my thesis ( I am trying to link contributions made by all emplyees to stock returns on election day). I have a few problems concerning linking the data.

I have a table with the names of the companies, their ticker, CEO name, etc, which looks like this:
Code:
Ticker	Company Name	Market Value	Industry	CEO Name
ADSK	Autodesk 	342		19		Carol A. Bartz 
AEE	Ameren 		265		24		Gary L. Rainwater 
AES	AES 		242		24		Paul T. Hanrahan 
AET	Aetna 		115		14		John Wallis Rowe

I also have a table (with over 740,000 records) which contains the campaign contributions to certain candidate committees, as follows:

Code:
Committee Id	Contributor		Occupation	                        Amount
C00010603	Sant, Roger W Mr.	AES Corporation/Chairman		2000
C00383653	Saint, Alexia		AES Corparation/Business Analyst	1000
C00383653	Jayaratne, Josh		AES Consulting LLC/Consultant		200
C00383653	Crane, Eric C		Autodesk Inc./Attorney			1000
C00003418	Cohen, Sarah Ms.	A.T. & T./Sales				250
C00010603	lucas, jack	         AT&T/software engineer			250
C00010603	Ludolph, Charles	Consultant				200
C00010603	Lucas, Freddy		self/Dad				999
C00010603	Luckhardt, Jack C Dr.	Retired					300
C00010603	Luckhardt, John C						500
C00010603	Marsh, Kris Ms.		Aspen Valley Medical/Nonprofit Exec	250
C00010603	Marsh, Dave		self employed/audio producer		500
C00010603	Marsh, Leslie		RMACInc./Insurance Executive		250

The two links I would like to make are:
1. Using the company name from the first table(and linking with the Occupation field in the second table), find the total donations made by employees of a company to a specific committee,
2. Using the CEO name (and linking it with Contributor field of the second table) find out what the contribution is of the CEO of each company to a specific Committee.

The problem with the first link is that the company name in the second table is part of the general field "Occupation" which also contains the job title, such as "AT&T/software engineer" so that I cannot make a direct link with the "AT&T" company name from the first table. Is their a possible way to link one table with another using wildcards or is their possible another solution (looking within a field for a certain string)?

There is also the problem that different names are used, for example AT&T and A.T. & T. or AES, AES Corp and AES Corporation. I think the solution to this is to try and make a table with all possible versions of a companies name, or would their be a simple way to automate this as well?

The problem with the second link is that the names of contributors aren't written in a standard form, which also means a direct relationship cannot be made. Any ideas on this?

As you can read, it is quite a complex problem, and as I have no programming skills I am finding it hard to link this data. I would appreciate any input on this, as I cannot manually go through 740,000 records to look at the amounts that employees of all S&P 500 companies have conributed.

Thanks in advance for your help!

Onur
 

Users who are viewing this thread

Back
Top Bottom