Conversion of 4 columns of binary data into one column with text

fboehlandt

Registered User.
Local time
Today, 21:55
Joined
Sep 5, 2008
Messages
90
Hi everyone,
I have a problem regarding the presentation of data in one of the tables of my database. Let's assume for simplicity's sake the table looks like the following (Fundcode could be the unique primary key):

<Fundname> <Fundcode> <Europe> <America> <Asia> <Other>
<Fund 1> <101> <-1> < 0> < 0> < 0>
<Fund 2> <102> < 0> <-1> < 0> < 0>
<Fund 3> <103> < 0> < 0> <-1> < 0>
<Fund 4> <104> <-1> < 0> < 0> < 0>
etc...

Where Europe, America etc. are regions an investment fund can be invested into. (-1) here denotes that the fund is invested, (0) indicates that the fund is not invested. Thus, those variables are infact binary switch variables (why no check boxes were used is beyond me but the same principle applies). I would like the table to be displayed in a more condensed form:

<Fundname> <Fundcode> <Region>
<Fund 1> <101> <Europe>
<Fund 2> <102> <America>
<Fund 3> <103> <Asia>
<Fund 4> <104> <Europe>
etc...

To make matters worse a given fund can be invested in more than one region, e.g.:

<Fundname> <Fundcode> <Europe> <America> <Asia> <Other>
<Fund 5> <103> < 0> < 0> <-1> <-1>
<Fund 6> <104> <-1> < 0> < 0> <-1>

In which case two or more rows must be included to account for the different regions (this would obviously require a new primary key):

<ID> <Fundname> <Fundcode> <Region>
<1> <Fund 1> <101> <Europe>
<2> <Fund 2> <102> <America>
<3> <Fund 3> <103> <Asia>
<4> <Fund 4> <104> <Europe>
<5> <Fund 5> <105> <Asia>
<6> <Fund 5> <105> <Other>
<7> <Fund 6> <106> <Europe>
<8> <Fund 6> <106> <Other>
etc...

Ideally I would like to implement this as a SQL query but I realize this might be a bit optimistic. I recon a VBA script cannot be avoided but I have no idea how to go about it. Maybe converting the (-1)/(0) entries to checkboxes makes things a bit easier? Any input is greatly appreciated. Thanks in advance
 
Last edited:
this problem arises BECAUSE your data is not normalised - you shouldnt really be storing flags for different markets horizontally in the row - you should really be storing the flags vertically.

so what you need is a table that looks more like this

fund1 europe true
fund1 america false
fund1 asia false
fund2 europe false
fund2 america true
fund2 asia false

in practice, the fund name ought to be the id of the fund record
and the region ought to be the id from a region table

its also a moot point whether you want to store all the negative values, as well as the positive values - ie you may not really need the flag, you could just say (all true values only!)

fund1 europe
fund2 america
fund2 europe
fund3 asia
fund3 africa

Doing it this way makes it easy to select either
a) the areas for a given fund or
b) the funds for a given area

and even areas not used in given funds, etc

and
c) also makes it easy to bring in new areas

--------
hopefully you may be in a position to revise this - the longer you leave it, the more problems/grief this will cause, although sometimes its too entrenched

its database thinking, not spreadsheet thinking
 
I couldn't agree more. The format you are suggesting is exactly what I'm aiming for. However, I'm not the administrator of the database. Thus, any manipulation of the data can only be indirect through queries or macros. I'm aware that the data is presented in a nonsensical way but this is how the data is received from the database vendor. I'm looking for a script or sql query that returns a query displaying the data in the form you suggested. In terms of the fundname as primary key, it may not be unique (whereas fundcode always will be). I'll try around with the Iif Function for a bit and see what the results are. Any further input is most welcome. Thanks very much for your insights so far!
 
i'm not saying using the fundname at all - i'm just using this as an illustration - you would want to use whatever unique key you have

if you cant redesign the dbs, then you can construct a union query

do a query that selects the status for america, another that selects for europe etc

then you have a union query (you can do it in the access design pane, but you HAVE to be in sql mode)

select * from america_qry UNION
select * from euroipe_qry UNION
select * from asia_qry UNION
select * from africa_qry

etc

the individual queries MUST have matching columns.

note that a union query is not updateable
 
I think that's what I will try. I'm not worried about updating the query. I intend to embed it in a VBA script. Thus, it will re-run every time the database is being accessed. Not the most elegant way but faulty database design in this case wasn't my fault...Thanks for your help, I'll report back with results :)

p.s. the list of regions is exhaustive. Only new funds will be added in future
 
This is what I've come up with. Let's assume the first table (Information) looks like the following:

Fundname, Fundcode, Europe, America, Asia, Other
Fund 1, 101, -1, 0, 0, 0
Fund 2, 102, 0, -1, 0, 0
Fund 3, 103, 0, 0, -1, 0
Fund 4, 104, -1, 0, 0, 0
etc...

Again, this information should be changed to the format Fundname, Fundcode, Region,.

Table 2 (Performance) would look like this:

Fundname, Fundcode, Date, ROI, AUM
Fund 1, 101, 01/01/2007, 4%, 26
Fund 1, 101, 01/02/2007, 2%, 27
Fund 1, 101, 01/03/2007, 6%, 20
Fund 1, 101, 01/04/2007, 3%, 28
Fund 2, 102, 01/01/2007, -2%, 9
Fund 2, 102, 01/02/2007, 0%, 9
Fund 2, 102, 01/03/2007, 2%, 9
Fund 2, 102, 01/04/2007, -3%, 10
etc...

With ROI (Return of Investment) and AUM (Assets Under Management) both as performance figures.

Table 3 (Leverage) contains the following data:

Fundname, Fundcode, Borrowed
Fund 1, 101, yes
Fund 2, 102, no
Fund 3, 103, yes
Fund 4, 104, yes
etc...

The intended format of the query is as follows:

Fundname, Fundcode, Date, Leverage, Performance, PType, Region
Fund 1, 101, 01/01/2007, yes, 4%, ROI, Europe
Fund 1, 101, 01/01/2007, yes, 26, AUM, Europe
Fund 1, 101, 01/02/2007, yes, 2%, ROI, Europe
Fund 1, 101, 01/02/2007, yes, 27, AUM, Europe
Fund 1, 101, 01/03/2007, yes, 6%, ROI, Europe
Fund 1, 101, 01/03/2007, yes, 20, AUM, Europe
Fund 1, 101, 01/04/2007, yes, 3%, ROI, Europe
Fund 1, 101, 01/04/2007, yes, 28, AUM, Europe
Fund 2, 102, 01/01/2007, no, -2%, ROI, America
Fund 2, 102, 01/01/2007, no, 9, AUM, America
Fund 2, 102, 01/02/2007, no, 0%, ROI, America
Fund 2, 102, 01/02/2007, no, 9, AUM, America
Fund 2, 102, 01/03/2007, no, 2%, ROI, America
Fund 2, 102, 01/03/2007, no, 9, AUM, America
Fund 2, 102, 01/04/2007, no, -3%, ROI, America
Fund 2, 102, 01/04/2007, no, 10, AUM, America

I have come up with following consequtive union queries that rotate through the different regions. The query works but turns out to be quite long for many regions (plus the query takes abouit two to three minutes to be calculated):

Code:
SELECT Performance.Fundcode,  Performance.Fundname, Performance.Date, Leverage.Borrowed, Performance.ROI As Performance, 'ROI' As [Type], 'Europe' As Region
FROM  Performance INNER JOIN (Information INNER JOIN Leverage ON Information.Fundcode = Leverage.Fundcode)
ON Performance.Fundcode = Information.Fundcode 
WHERE Information.Europe = -1
UNION
SELECT Performance.Fundcode,  Performance.Fundname, Performance.Date, Leverage.Borrowed, Performance.AUM As Performance, 'AUM' As [Type], 'Europe' As Region
FROM  Performance INNER JOIN (Information INNER JOIN Leverage ON Information.Fundcode = Leverage.Fundcode)
ON Performance.Fundcode = Information.Fundcode 
WHERE Information.Europe = -1
UNION
SELECT Performance.Fundcode,  Performance.Fundname, Performance.Date, Leverage.Borrowed, Performance.ROI As Performance, 'ROI' As [Type], 'America' As Region
FROM  Performance INNER JOIN (Information INNER JOIN Leverage ON Information.Fundcode = Leverage.Fundcode)
ON Performance.Fundcode = Information.Fundcode 
WHERE Information.America = -1
UNION
SELECT Performance.Fundcode,  Performance.Fundname, Performance.Date, Leverage.Borrowed, Performance.AUM As Performance, 'AUM' As [Type], 'America' As Region
FROM  Performance INNER JOIN (Information INNER JOIN Leverage ON Information.Fundcode = Leverage.Fundcode)
ON Performance.Fundcode = Information.Fundcode 
WHERE Information.America = -1
etc...(for all regions)

Is there a way to shorten this query or make it quicker?
 
is this a misprint?

you have america twice and europe twice

depends how much data you have

it shouldn't take as long as minutes, but you may have a massive dbs

--------
how long does a single query take (ie not union - just one area)
 
No, the performance data should be stacked:
ROI
AUM
ROI
AUM
etc...
thus for every Fund-Date combination there will be two region entries, one for ROI and the other for AUM. I realize that the tables and queries contain a lot of duplicate data but as I said, I have no impact on the format of said database. I think I'll just stick with my query, works fine for this purpose. Thanks again,
cheers
 

Users who are viewing this thread

Back
Top Bottom