PHP regex: email address(es) from text, sometimes right before a full stop
I have texts that can contain one or more email addresses, and I use a regex to match them. First I used this (from a previous question):
[A-Za-z0-9_-]+@[A-Za-z0-9_-]+\.([A-Za-z0-9_-][A-Za-z0-9_]+)
This caused two problems. It failed when a . was used before the @, and it also failed when an email address ended in two or more domain extensions (for example ...@domain.co.uk). So I changed the expression to
^([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,6})
This solves both problems but creates a new one. If an email address in the text sits right before a full stop, the full stop is now included in the address! So this text gives me problems:
Please email us at: some@example.com. You can also mail us at some@example.co.uk. Etc...
Is there a way to exclude this last . if it is followed by either a blank space or a line break?
P.S. I do not need to validate email addresses; I need to make sure my expression knows where one or more email addresses occur in a text and where they stop.
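For reference, the trailing-dot behaviour can be reproduced with a short script. This is only a sketch: it assumes the second pattern with the ^ anchor dropped (so it can match anywhere in the text) and with the backslash escapes written out; the variable names are illustrative.

```php
<?php
// Second pattern from the question, without the leading ^ (an assumption
// made for this demo so that addresses in the middle of the text match).
$pattern = '/([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,6})/';
$text = 'Please email us at: some@example.com. You can also mail us at some@example.co.uk. Etc...';

// Collect every full match; $matches[0] holds the matched substrings.
preg_match_all($pattern, $text, $matches);

// The last character class [a-z.] also accepts a dot, so the
// sentence-ending full stop is swallowed into each match.
print_r($matches[0]);
// Array
// (
//     [0] => some@example.com.
//     [1] => some@example.co.uk.
// )
```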
php regex
1
You could, for example, change your regex to ([a-z0-9_.-]+)@((?:[\da-z.-]+)\.)+([a-z]{2,6}) (demo). That repeats the part after the @ sign, including the first dot, one or more times, and omits the dot from the last part.
– The fourth bird
Nov 22 '18 at 21:06
1
@Jeff - That regex doesn't allow emails with foreign characters, dashes or numbers, like hello@åä-ö.com, while åä-ö.com actually is a valid domain. You should also be able to have dots in the name part. When matching email addresses (and URLs), you shouldn't be too strict.
– Magnus Eriksson
Nov 22 '18 at 21:15
1
I had them in there: ..@[A-Za-z0-9_-].. - but of course I forgot about subdomains like spam@sub-domain.my-host.com
– Jeff
Nov 22 '18 at 21:23
1
@DirkJ.Faber absolutely right. Maybe take Magnus' comments about mine (for 'ä') into that as well.
– Jeff
Nov 22 '18 at 21:25
1
Basically, look for a good ready-made regex for this online. Trying to do it yourself is usually really painful. If you check regexes that take most rules into account, they are huge...
– Magnus Eriksson
Nov 22 '18 at 21:26
edited Nov 22 '18 at 22:17
Dirk J. Faber
asked Nov 22 '18 at 21:00
Dirk J. Faber
You may use
/[\p{L}0-9_.-]+@[0-9\p{L}.-]+\.[a-z.]{2,6}\b/u
See the regex demo. Or, to only start matching from a letter or digit:
/[\p{L}0-9][\p{L}0-9_.-]*@[0-9\p{L}.-]+\.[a-z.]{2,6}\b/u
\p{L} will match all Unicode base letters (add \p{M} if you need to also match diacritics, though I doubt there are any here), and the word boundary \b at the end makes the match stop before a trailing dot. Remove all unnecessary groupings that you are not using.
See the PHP demo:
// The /u flag enables Unicode mode so \p{L} matches letters such as å, ä, ö
$re = '/[\p{L}0-9_.-]+@[0-9\p{L}.-]+\.[a-z.]{2,6}\b/u';
$str = 'Please email us at: some@example.com. You can also mail us at some@example.co.uk. Etc... hello@åä-ö.com
example@so.il.uk';
// $matches[0] holds the full text of every match
if (preg_match_all($re, $str, $matches)) {
    print_r($matches[0]);
}
Output:
Array
(
    [0] => some@example.com
    [1] => some@example.co.uk
    [2] => hello@åä-ö.com
    [3] => example@so.il.uk
)
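As a rough sketch of how the second pattern (the one that starts on a letter or digit) could be wrapped for reuse, the snippet below puts it in a small helper. The function name extractEmails and the sample sentence are made up for illustration and are not part of the original answer.

```php
<?php
// Hypothetical helper: returns every address-like substring found in $text.
function extractEmails(string $text): array
{
    // Must start with a letter or digit; \b forces the match to end on a
    // word character, so a sentence-ending dot is left out of the match.
    $re = '/[\p{L}0-9][\p{L}0-9_.-]*@[0-9\p{L}.-]+\.[a-z.]{2,6}\b/u';

    // preg_match_all() returns false on failure (e.g. invalid UTF-8 input).
    if (preg_match_all($re, $text, $m) === false) {
        return [];
    }
    return $m[0];
}

$found = extractEmails('Mail some@example.co.uk. Or .dot@example.com, thanks');
print_r($found);
// Array
// (
//     [0] => some@example.co.uk
//     [1] => dot@example.com
// )
```

Note that the last part is [a-z.], so an address with an upper-case extension would likely be cut short or missed unless the i flag is added; whether that matters depends on the input.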
answered Nov 22 '18 at 22:16
Wiktor Stribiżew