Extract email address from text

Robster

Registered User.
Local time
Today, 00:09
Joined
Mar 13, 2014
Messages
60
I have a field which contains various text including an email address which i need to extract.
My research says that I need to use a regex code but dont know how to get that into Access.

I have attached an example of the file i'm importing into Access.

Any help appreciated.
 

Attachments

Last edited:
You will need 2 string manipulation functions:

Mid (http://www.techonthenet.com/access/functions/string/instr.php) which extracts a substring

InStr (http://www.techonthenet.com/access/functions/string/instr.php) which finds the position of a character in a string.

From the sample data you posted, it looks like all your email addresses are immediately after the '<' character and before the '>' character. If that's the case, this is the pseudo code for your function:

str_Search = text to search
str_Tmp = temporary string used to manipulate str_Search
StartPos = position of '<'
EndPos = position of '>'
str_Email = email value, will be returned

Use InStr() to find StartPos
Use Mid() to set str_Tmp to substring of str_Search starting at StartPos
Use InStr() to find EndPos
Use Mid() to set str_Email to substring of str_Tmp ending at EndPos
Return str_Email
 
Here's some code that uses regex.(Regular expressions)

You do need to set a reference to the Microsoft VBScript Regular Expressions library

I can not say it will accept or reject emails that don't meet the standard.
See this link for info on complexity.
More info re regex

Code:
'---------------------------------------------------------------------------------------
' Procedure : Emailfinder
' Author    : Jack
' Date      : 18/02/2014
' Purpose   : This will get valid email addresses from a string according to link.
'  Found "best regex pattern for email validation" at
'  http://blog.trojanhunter.com/2012/09/26/the-best-regex-to-validate-an-email-address
' Pattern:  [-0-9a-zA-Z.+_]+@[-0-9a-zA-Z.+_]+\.[a-zA-Z]{2,4}
'
' This routine will return multiple email addresses when there are multiples in the string.
'
''                                ---------------------------------------
'*** Requires a reference to the  Microsoft VBScript Regular Expressions  library
'                                 _______________________________________
'---------------------------------------------------------------------------------------
'
Function Emailfinder(t As String) As String
          Dim MyRE As Object
10        Set MyRE = New Regexp
          Dim MyMatches As MatchCollection
          Dim MyResult As String
          'set the email pattern
20        On Error GoTo Emailfinder_Error
30        Emailfinder = ""  'set to empty string
40        MyRE.Pattern = "[-0-9a-zA-Z.+_]+@[-0-9a-zA-Z.+_]+\.[a-zA-Z]{2,4}"
          
50        MyRE.Global = True
60        MyRE.IgnoreCase = True
70        Set MyMatches = MyRE.Execute(t)
80        If MyMatches.Count > 0 Then
90            For Each MyMatch In MyMatches
100               MyResult = MyResult & MyMatch.Value & vbCrLf
110           Next
120           Emailfinder = MyResult  ' one or more valid email addresses
130       Else
140           Emailfinder = ""    'empty string
150       End If
160       On Error GoTo 0
170       Exit Function

Emailfinder_Error:

180       MsgBox "Error " & Err.number & " (" & Err.Description & ") in procedure Emailfinder of Module Module1"
End Function

Here is a test routine to pick email addresses out of text strings
Code:
Sub sometest()
          Dim s(14) As String 'test data
          Dim i As Integer
10       On Error GoTo sometest_Error

20        s(0) = "h="
30        s(1) = " 1="
40        s(2) = " 10pt;>.</span></p>=jim.XX@samoa.net.au<p= ==mmgf  hank@gmail.com gerry@Att.net"
50        s(3) = "(301)277-="
60        s(4) = " <test@oceanfire.com;bill@gmail.com>"
70        s(5) = "<test@oceanfire.com>;"
80        s(6) = "Test@ promotioncorner.com"
90        s(7) = "<test@oceanfire.com>"
100       s(8) = "<20140206130446.BB05F2995@diligence.cnchost.com>"
110       s(9) = "<test@promotioncorner.com>"
120       s(10) = "<test@wadecorporategifts.com>"
130       s(11) = "<0Lh8Cl-1VPCF507ZX-00oYM5@mx.perfora.net>"
140       s(12) = "<test@vectorpromo.com>"
150       s(13) = "-0500"
160       s(14) = " Test@ promotioncorner.com"

170       For i = 1 To UBound(s)
180           If Emailfinder(s(i)) <> "" Then
190               Debug.Print i & "  " & s(i) & " * " & vbCrLf & Emailfinder(s(i))
              Else
                  Debug.Print i & "  " & s(i) & "  Nothing here "
200           End If
210       Next i

220      On Error GoTo 0
230      Exit Sub

sometest_Error:

240       MsgBox "Error " & Err.number & " (" & Err.Description & ") in procedure sometest of Module Module1"
End Sub

Test results :

Code:
1   1=  Nothing here 
2   10pt;>.</span></p>=jim.XX@samoa.net.au<p= ==mmgf  hank@gmail.com gerry@Att.net * 
jim.XX@samoa.net.au
hank@gmail.com
gerry@Att.net

3  (301)277-=  Nothing here 
4   <test@oceanfire.com;bill@gmail.com> * 
test@oceanfire.com
bill@gmail.com

5  <test@oceanfire.com>; * 
test@oceanfire.com

6  Test@ promotioncorner.com  Nothing here 
7  <test@oceanfire.com> * 
test@oceanfire.com

8  <20140206130446.BB05F2995@diligence.cnchost.com> * 
20140206130446.BB05F2995@diligence.cnchost.com

9  <test@promotioncorner.com> * 
test@promotioncorner.com

10  <test@wadecorporategifts.com> * 
test@wadecorporategifts.com

11  <0Lh8Cl-1VPCF507ZX-00oYM5@mx.perfora.net> * 
0Lh8Cl-1VPCF507ZX-00oYM5@mx.perfora.net

12  <test@vectorpromo.com> * 
test@vectorpromo.com

13  -0500  Nothing here 
14   Test@ promotioncorner.com  Nothing here

----Good luck ---------
 
Last edited:
Thanks Plog.

I've managed to remove all characters before the '<' but how do I do a Mid on the remainder when each one is a different length?
 
You find the end position, then you use that in your next mid:

Mid(str_Tmp, 1, EndPos)
 
Thanks. I had worked it out.

Thanks for your help.
 

Users who are viewing this thread

Back
Top Bottom