Advanced dedup help + linking lookup table

pmonaghan

Registered User.
Local time
Today, 09:01
Joined
Jul 17, 2012
Messages
24
Thanks for looking.

I'm currently working with a very large database which contains a lot of duplicates. I had a lot of help in my last thread but I have a few more questions I need to find answers for.

The project I am working on is based on email analysis. I have restored five mailboxes and imported them into a database for filtering /analysis. Duplicate emails exist across and within these mailboxes and I am trying to remove these duplicates unless they are in a designated 'protected email folder'. To make things more complicated I also need to give priority to the emails that I am deleting; for example if duplicates are identified, delete first from mailbox C, then from mailbox B then from mailbox A- mailbox A has top priortiy- email from mailbox C gets deleted before mailbox B which gets deleted before mailbox A.

I am basing the dedup process on four queries and two look up tables.

Lookup Tables

Lookup Table 1: tblFilePriority
Contains two fields [File] and [FilePriority]. The File field contains the name of the mailbox and the FilePriority field contains a number (1-5) to give each mailbox a priority value.

Lookup Table 2: tblProtectedFolders
Contains two fields [Folder] and [Protected]. Folder contains a list of all mailbox folders and Protected lists a 1 next to a folder which must not be deleted (even if duplicates exist) and a 0 next to folders which may be deleted if a duplicate is found.


Queries

Query 1: qryFindImportantEmails
SELECT Min(tblFilePriority.FilePriority) AS MinOfPriority, tblEmails.From, tblEmails.Subject, tblEmails.Sent
FROM tblEmails INNER JOIN tblFilePriority ON tblEmails.File = tblEmails.File
GROUP BY tblEmails.From, tblEmails.Subject, tblEmails.Sent;


Query 2: qryFindProtectedFolders
I need help on this....


Query 3: qryFindDuplicateEmails
SELECT tblEmails.*
FROM tblEmails INNER JOIN qryImportantEmails ON (tblEmails.from = qryImportantEmails.from) AND (tblEmails.Subject = qryImportantEmails.Subject) AND (tblEmails.sent = qryImportantEmails.sent)
WHERE (((qryImportantEmails.MinOfFile)<>[tblEmails].[File]));

This does not run at the moment. I get an 'enter parameter value' box.

Query 4: qryDeleteDuplicateEmails
DELETE *
FROM tblEmails
WHERE tblEmails.ID
IN
(SELECT ID FROM qryDuplicateEmails);




As you may have guessed, I'm pretty new to Access so I won't be offended if you feel the need to explain things at a basic level.

Thanks again for reading.

Pete​
 
In this query you reference qryImportantEmails

Query 3: qryFindDuplicateEmails
SELECT tblEmails.*
FROM tblEmails INNER JOIN qryImportantEmails ON (tblEmails.from = qryImportantEmails.from) AND (tblEmails.Subject = qryImportantEmails.Subject) AND (tblEmails.sent = qryImportantEmails.sent)
WHERE (((qryImportantEmails.MinOfFile)<>[tblEmails].[File]));


But based on your post the query name is not qryImportantEmails but rather qryFindImportantEmails. Having the wrong query name will give you the parameter error.

Now regarding Query 2: qryFindProtectedFolders

How does tblEmails relate to tblProtectedFolders? I am assuming that they join by the field called folder. If so you could just join them in a query like this:

SELECT * FROM tblEmails INNER JOIN tblProtectedFolders ON tblEmails.folder=tblProtectedFolders.folder
 
Thanks for replying. I'm afraid I got a bit mixed up with my table names there. Another search I ran in the mean time is below. This has the right query name but still gives the parameter error. The error says:

Enter Parameter Value
FLImportantEMails.MinOfFile

If that helps.


FL Queries

Query 1: FLImportantEmails
SELECT Min(tblFilePriority.FilePriority) AS MinOfPriority, [FLEmails].From, [FLEmails].Subject, [FLEmails].Sent
FROM FLEmails INNER JOIN tblFilePriority ON [FLEmails].File=tblFilePriority.File
GROUP BY [FLEmails].From, [FLEmails].Subject, [FLEmails].Sent;



Query 2: qryFindProtectedFolders

I need help on this....





Query 3: FLDuplicateEmails
SELECT FLEmails.*
FROM FLEmails INNER JOIN FLImportantEmails ON (FLEmails.from = FLImportantEmails.from) AND (FLEmails.Subject = FLImportantEmails.Subject) AND (FLEmails.sent = FLImportantEmails.sent)
WHERE (((FLImportantEmails.MinOfFile)<>[FLEmails].[File]));


Query 4: DeleteDuplicateFLEmails
DELETE *
FROM FLEmails
WHERE FLEmails.ID
IN
(SELECT ID FROM FLDuplicateEmails);


With regards to query 2, I don't understand how that select query would take the protected status into account.. Sorry if thats a bit novice!​
 
Query 1: FLImportantEmails
SELECT Min(tblFilePriority.FilePriority) AS MinOfPriority, [FLEmails].From, [FLEmails].Subject, [FLEmails].Sent
FROM FLEmails INNER JOIN tblFilePriority ON [FLEmails].File=tblFilePriority.File
GROUP BY [FLEmails].From, [FLEmails].Subject, [FLEmails].Sent;



Enter Parameter Value
FLImportantEMails.MinOfFile

I do not see a field called MinOfFile in the FLImportantEMails query. I do see MinOfPriority.

Could you provide the names of the fields in tblEmails (or whatever it is actually called)? With that I may be able to help more effectively.
 
Sure.

FLEmails fields:

ID | File | Folder | From | To | cc | Bcc | Attachments | Subject | Body | Sent | Received
FLImportantEmails fields:

MinOfPriority | From | Subject | Sent
 
Last edited:
I assume that the joins to the lookup tables are made via the file and folder fields of the FLEmails table.

Are there matching records in the 2 lookup tables for every record in FLEmails? If so, then, I would join the 3 tables first via INNER JOINS

BTW, you cannot have 2 fields with the same name (i.e. from).

query name: qryDataWithPriorityAndProtectedStatus
SELECT FLEmails.ID, tblFilePriority.FilePriority, tblProtectedFolders.Protected, FLEmails.From, FLEmails.To, FLEmails.CC, FLEmails.BCC, FLEmails.Attachments, FLEmails.Subject, FLEmails.Body, FLEmails.Sent, FLEmails.From2
FROM tblProtectedFolders INNER JOIN (tblFilePriority INNER JOIN FLEmails ON tblFilePriority.File = FLEmails.FILE) ON tblProtectedFolders.Folder = FLEmails.Folder;

The above query now has the Filepriority and Protected status in one data set, so now you can use the above query to create queries that pull the records you want.

So to pull all of the protected records

SELECT ID
FROM qryDataWithPriorityAndProtectedStatus
WHERE qryDataWithPriorityAndProtectedStatus.Protected=1

Now you can create a query with subquery to identify those records with priority 1 that can be deleted

DELETE *
FROM FLEmails
WHERE FLEmails.ID
IN (SELECT ID
FROM qryDataWithPriorityAndProtectedStatus
WHERE priority=1 and ID NOT IN (SELECT ID
FROM qryDataWithPriorityAndProtectedStatus
WHERE qryDataWithPriorityAndProtectedStatus.Protected=1))
 

Users who are viewing this thread

Back
Top Bottom