I have a text field where users can write anything.
For example:
Lorem Ipsum is simply dummy text. http://www.youtube.com/watch?v=duqi_r4sgwo of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. http://www.youtube.com/watch?v=a_6gnzckaju&feature=relmfu It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
Now I would like to parse it and find all YouTube video URLs and their IDs.
Any idea how this works?
A YouTube video URL may be encountered in a variety of formats:
- latest short format:
http://youtu.be/nlqaf9hrvby
- iframe:
http://www.youtube.com/embed/nlqaf9hrvby
- iframe (secure):
https://www.youtube.com/embed/nlqaf9hrvby
- object param:
http://www.youtube.com/v/nlqaf9hrvby?fs=1&hl=en_us
- object embed:
http://www.youtube.com/v/nlqaf9hrvby?fs=1&hl=en_us
- watch:
http://www.youtube.com/watch?v=nlqaf9hrvby
- users:
http://www.youtube.com/user/scobleizer#p/u/1/1p3vcrhsygo
- ytscreeningroom:
http://www.youtube.com/ytscreeningroom?v=nrhvzbjvx8i
- any/thing/goes!:
http://www.youtube.com/sandalsresorts#p/c/54b8c800269d7c1b/2/pps-8dmran4
- any/subdomain/too:
http://gdata.youtube.com/feeds/api/videos/nlqaf9hrvby
- more params:
http://www.youtube.com/watch?v=spdj54kf-vy&feature=g-vrec
- query may have dot:
http://www.youtube.com/watch?v=spdj54kf-vy&feature=youtu.be
- nocookie domain:
http://www.youtube-nocookie.com
Here is a PHP function with a commented regex that matches each of these URL forms and converts them to links (if they are not links already):
// Linkify YouTube URLs that are not already links.
function linkifyyoutubeurls($text) {
    $text = preg_replace('~(?#!js youtubeid rev:20160125_1800)
        # Match a non-linked YouTube URL in the wild. (rev:20130823)
        https?://          # Required scheme: either http or https.
        (?:[0-9a-z-]+\.)?  # Optional subdomain.
        (?:                # Group host alternatives.
          youtu\.be/       # Either youtu.be,
        | youtube          # or youtube.com or
          (?:-nocookie)?   # youtube-nocookie.com
          \.com            # followed by
          \S*?             # anything (lazily) up to the video_id,
          [^\w\s-]         # but the char before the ID is a non-ID char.
        )                  # End host alternatives.
        ([\w-]{11})        # $1: video_id is exactly 11 chars.
        (?=[^\w-]|$)       # Assert next char is non-ID or EOS.
        (?!                # Assert URL is not pre-linked.
          [?=&+%\w.-]*     # Allow URL (query) remainder.
          (?:              # Group pre-linked alternatives.
            [\'"][^<>]*>   # Either inside a start tag,
          | </a>           # or inside <a> element text contents.
          )                # End recognized pre-linked alts.
        )                  # End negative lookahead assertion.
        [?=&+%\w.-]*       # Consume any URL (query) remainder.
        ~ix', '<a href="http://www.youtube.com/watch?v=$1">youtube link: $1</a>',
        $text);
    return $text;
}
And here is a JavaScript version with the exact same regex (with comments removed):
// Linkify YouTube URLs that are not already links.
function linkifyyoutubeurls(text) {
    var re = /https?:\/\/(?:[0-9a-z-]+\.)?(?:youtu\.be\/|youtube(?:-nocookie)?\.com\S*?[^\w\s-])([\w-]{11})(?=[^\w-]|$)(?![?=&+%\w.-]*(?:['"][^<>]*>|<\/a>))[?=&+%\w.-]*/ig;
    return text.replace(re,
        '<a href="http://www.youtube.com/watch?v=$1">youtube link: $1</a>');
}
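As a quick sanity check, here is a small sketch that runs the same pattern over a string holding one bare URL and one URL that already sits inside an `<a>` tag. The names `YOUTUBE_URL_RE`, `linkifySample` and `sampleText` are made up for this illustration, and the pattern assumes `\S*?` (any run of non-whitespace) in front of the ID, which is what lets it reach the ID in `watch?v=` style URLs.

```javascript
// Pattern from the JavaScript version, assuming \S*? before the video ID.
const YOUTUBE_URL_RE = /https?:\/\/(?:[0-9a-z-]+\.)?(?:youtu\.be\/|youtube(?:-nocookie)?\.com\S*?[^\w\s-])([\w-]{11})(?=[^\w-]|$)(?![?=&+%\w.-]*(?:['"][^<>]*>|<\/a>))[?=&+%\w.-]*/ig;

// Hypothetical helper: wraps every non-linked YouTube URL in an <a> element.
function linkifySample(text) {
  return text.replace(
    YOUTUBE_URL_RE,
    '<a href="http://www.youtube.com/watch?v=$1">youtube link: $1</a>'
  );
}

// One bare short URL, and one watch URL that is already an href value.
const sampleText =
  'see http://youtu.be/nlqaf9hrvby and ' +
  '<a href="http://www.youtube.com/watch?v=spdj54kf-vy">this</a>';

console.log(linkifySample(sampleText));
// see <a href="http://www.youtube.com/watch?v=nlqaf9hrvby">youtube link: nlqaf9hrvby</a> and <a href="http://www.youtube.com/watch?v=spdj54kf-vy">this</a>
```

Only the bare URL is rewritten; the one inside the `href` attribute is skipped by the negative lookahead. Running the helper again over its own output should leave the text unchanged, since the generated links then count as pre-linked.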
Notes:
- The video_id portion of the URL is captured in the one and only capture group: $1.
- If you know that your text does not contain any pre-linked URLs, you can safely remove the negative lookahead assertion that tests for this condition (the `(?!` group whose comment says it asserts that the URL is not pre-linked). This will speed up the regex somewhat.
- The replace string can be modified to suit. The one provided above creates a link to the generic "http://www.youtube.com/watch?v=video_id" style URL and sets the link text to "youtube link: video_id".
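To make these notes concrete, here is a minimal sketch that pulls the IDs out through capture group 1 and swaps in a different replacement. It uses the pattern with the negative lookahead dropped, so it assumes the input contains no pre-linked URLs. `PLAIN_TEXT_YOUTUBE_RE`, `extractYouTubeIds`, `toEmbedLinks` and the replacement markup are hypothetical choices for illustration, not part of the functions above.

```javascript
// Same pattern minus the "not pre-linked" negative lookahead.
// Assumption: the text is plain, with no existing <a> tags around the URLs.
const PLAIN_TEXT_YOUTUBE_RE = /https?:\/\/(?:[0-9a-z-]+\.)?(?:youtu\.be\/|youtube(?:-nocookie)?\.com\S*?[^\w\s-])([\w-]{11})(?=[^\w-]|$)[?=&+%\w.-]*/ig;

// Collect capture group 1 (the video ID) from every match.
function extractYouTubeIds(text) {
  return Array.from(text.matchAll(PLAIN_TEXT_YOUTUBE_RE), (match) => match[1]);
}

// Custom replacement: a callback receives the whole match and group 1.
function toEmbedLinks(text) {
  return text.replace(PLAIN_TEXT_YOUTUBE_RE, (wholeUrl, videoId) =>
    '<a href="https://www.youtube.com/embed/' + videoId + '">watch ' + videoId + '</a>'
  );
}

const userText =
  'dummy text. http://www.youtube.com/watch?v=duqi_r4sgwo of printing, and ' +
  'http://www.youtube.com/watch?v=a_6gnzckaju&feature=relmfu popularised';

console.log(extractYouTubeIds(userText));
// [ 'duqi_r4sgwo', 'a_6gnzckaju' ]
```

Using a callback instead of a `$1` replacement string is just one option; it becomes handy once the link needs anything beyond simple substitution, such as escaping or per-ID lookups.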
Edit 2011-07-05: Added the - hyphen to the ID char class.
Edit 2011-07-17: Fixed the regex to consume any remaining part (e.g. query) of the URL following the YouTube ID. Added the 'i' ignore-case modifier. Renamed the function to camelCase. Improved the pre-linked lookahead test.
Edit 2011-07-27: Added the new "user" and "ytscreeningroom" formats of YouTube URLs.
Edit 2011-08-02: Simplified/generalized to handle the new "any/thing/goes" YouTube URLs.
Edit 2011-08-25: Several modifications:
- Added a JavaScript version of the linkifyyoutubeurls() function.
- The previous version had the scheme (HTTP protocol) part optional and thus would match invalid URLs. Made the scheme part required.
- The previous version used the \b word boundary anchor around the video_id. However, this does not work if the video_id begins or ends with a - dash. Fixed so that it now handles this condition.
- Changed the video_id expression so that it must be exactly 11 characters long.
- The previous version failed to exclude pre-linked URLs if they had a query string following the video_id. Improved the negative lookahead assertion to fix this.
- Added + and % to the character class matching the query string.
- Changed the PHP version regex delimiter from % to ~.
- Added a "Notes" section with some handy notes.
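The \b problem is easy to reproduce in isolation. The sketch below uses two cut-down patterns (youtu.be host only) and a made-up ID, `-abcdefghi-`, that begins and ends with a dash; the variable names are hypothetical. Since `-` is not a word character, there is no word boundary between `/` and a leading `-`, so the \b variant cannot match.

```javascript
// Cut-down pattern in the style of the earlier revision: \b around the ID.
const withWordBoundary = /youtu\.be\/\b([\w-]{11})\b/;
// Cut-down pattern in the current style: lookahead for a non-ID char or end of string.
const withLookahead = /youtu\.be\/([\w-]{11})(?=[^\w-]|$)/;

const plainUrl = 'http://youtu.be/nlqaf9hrvby';
const dashedUrl = 'http://youtu.be/-abcdefghi-'; // invented ID with leading and trailing dash

console.log(withWordBoundary.test(plainUrl));  // true
console.log(withWordBoundary.test(dashedUrl)); // false
console.log(withLookahead.exec(dashedUrl)[1]); // -abcdefghi-
```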
Edit 2011-10-12: The YouTube URL host part may now have any subdomain (not just www.).
Edit 2012-05-01: The consume-URL section may now allow '-'.
Edit 2013-08-23: Added an additional format provided by @mei. (The query part may have a . dot.)
Edit 2013-11-30: Added an additional format provided by @cronus: youtube-nocookie.com.
Edit 2016-01-25: Fixed the regex to handle an error case provided by cronus.