How to get all <img src> URLs of a web page in iOS UIWebView?
Hi all,
I am trying to get all the image URLs of the current page in a UIWebView.
Here is my code:
- (void)webViewDidFinishLoad:(UIWebView*)webView {
NSString *firstImageUrl = [self.webView stringByEvaluatingJavaScriptFromString:@"var images = document.getElementsByTagName('img');images[0].src.toString();"];
NSString *imageUrls = [self.webView stringByEvaluatingJavaScriptFromString:@"var images= document.getElementsByTagName('img');var imageUrls = "";for(var i = 0; i < images.length; i++){var image = images[i];imageUrls += image.src;imageUrls += \\β,\\β;}imageUrls.toString();"];
NSLog(@"firstUrl : %@", firstImageUrl);
NSLog(@"images : %@",imageUrls);
}
The first NSLog prints the correct image src, but the second NSLog prints nothing:
2013-01-25 00:51:23.253 WebDemo[3416:907] firstUrl: https://www.paypalobjects.com/en_US/i/scr/pixel.gif
2013-01-25 00:51:23.254 WebDemo[3416:907] images :
I do not know why. Please help.
Thanks.
Perrohunter pointed out an excellent solution based on NSRegularExpression. If you do not want to build an array of matches, you can use the block-based enumerateMatchesInString method:
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(<img\\s[\\s\\S]*?src\\s*?=\\s*?['\"](.*?)['\"][\\s\\S]*?>)+?"
options:NSRegularExpressionCaseInsensitive
error:&error];
[regex enumerateMatchesInString:yourHTMLSourceCodeString
options:0
range:NSMakeRange(0, [yourHTMLSourceCodeString length])
usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
NSString *img = [yourHTMLSourceCodeString substringWithRange:[result rangeAtIndex:2]];
NSLog(@"img src %@",img);
}];
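I also updated the regex pattern to handle the following cases:
- attributes may exist between the start of the img tag and the src attribute;
- attributes may exist after the src attribute and before the closing >;
- line breaks may occur inside an img tag (by default, . does not match line breaks);
- the src value may be delimited by either ' or ";
- there may be spaces around =, i.e. src = rather than src=.
Other approaches work as well (for example, the JSON-based one from Joris). For img tags, enumerateMatchesInString can be used in place of matchesInString.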
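The pattern's handling of extra attributes, line breaks, either quote style, and spaces around = can be checked quickly outside Xcode. The following is a minimal sketch in JavaScript, assuming its regex engine treats these constructs the same way NSRegularExpression does; the helper name extractImgSrcs and the sample markup are made up for illustration.

```javascript
// Same pattern as the NSRegularExpression above, written as a JavaScript literal.
// Group 1 is the whole tag, group 2 is the src value.
const IMG_SRC_PATTERN = /(<img\s[\s\S]*?src\s*?=\s*?['"](.*?)['"][\s\S]*?>)+?/gi;

// Hypothetical helper: collect every src value found in an HTML string.
function extractImgSrcs(html) {
  const srcs = [];
  IMG_SRC_PATTERN.lastIndex = 0; // reset the shared global regex before scanning
  let match;
  while ((match = IMG_SRC_PATTERN.exec(html)) !== null) {
    srcs.push(match[2]);
  }
  return srcs;
}

// Made-up sample markup: a leading attribute, a tag split across lines with
// single quotes and spaces around "=", and a self-closing tag.
const sampleHtml = [
  '<img class="a" src="one.png" alt="x">',
  "<IMG\n  src = 'two.jpg' >",
  '<img src="three.gif"/>',
].join("\n");

console.log(extractImgSrcs(sampleHtml)); // [ 'one.png', 'two.jpg', 'three.gif' ]
```

As in the Objective-C version, the src value is read from capture group 2 because group 1 wraps the whole tag.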
You can do this with JavaScript:
// javascript to execute:
(function() {
var images=document.querySelectorAll("img");
var imageUrls=[];
[].forEach.call(images, function(el) {
imageUrls[imageUrls.length] = el.src;
});
return JSON.stringify(imageUrls);
})()
This returns the URLs as a JSON string, which can then be parsed in Objective-C:
NSString *imageURLString = [self.webView stringByEvaluatingJavaScriptFromString:@"(function() {var images=document.querySelectorAll(\"img\");var imageUrls=[];[].forEach.call(images, function(el) { imageUrls[imageUrls.length] = el.src;}); return JSON.stringify(imageUrls);})()"];
// parse json back into an array
NSError *jsonError = nil;
NSArray *urls = [NSJSONSerialization JSONObjectWithData:[imageURLString dataUsingEncoding:NSUTF8StringEncoding] options:0 error:&jsonError];
if (!urls) {
NSLog(@"JSON error: %@", jsonError);
return;
}
NSLog(@"Images : %@", urls);
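One reason to pass the result back as JSON rather than as a hand-built delimited string is that URLs may themselves contain the delimiter. The sketch below illustrates the difference in plain JavaScript; the URLs are made up, and a comma is assumed as the delimiter purely for illustration.

```javascript
// Made-up URLs; the second one contains a comma in its query string.
const imageUrls = [
  "https://example.com/a.png",
  "https://example.com/resize?size=100,200&src=b.png",
];

// Joining with a comma loses the boundary between the two URLs.
const joined = imageUrls.join(",");
const splitBack = joined.split(",");

// JSON keeps each URL as a separate, escaped string.
const encoded = JSON.stringify(imageUrls);
const decoded = JSON.parse(encoded);

console.log(splitBack.length); // 3 (the second URL was cut in two)
console.log(decoded.length);   // 2 (the original array is restored)
```

On the Objective-C side, NSJSONSerialization plays the role of JSON.parse here.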
You can achieve this by running a regular expression over the HTML source of the loaded web view:
NSString *yourHTMLSourceCodeString = [webView stringByEvaluatingJavaScriptFromString:@"document.body.innerHTML"];
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(<img src=\"(.*?)\">)+?"
options:NSRegularExpressionCaseInsensitive
error:&error];
NSArray *matches = [regex matchesInString:yourHTMLSourceCodeString
options:0
range:NSMakeRange(0, [yourHTMLSourceCodeString length])];
NSLog(@"total matches %lu", (unsigned long)[matches count]);
for (NSTextCheckingResult *match in matches) {
NSString *img = [yourHTMLSourceCodeString substringWithRange:[match rangeAtIndex:2]] ;
NSLog(@"img src %@",img);
}
This is a pretty basic regex that only matches tags written exactly as <img src="...">; it needs to be extended if your images have other attributes such as class or id.
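To see concretely where the basic pattern falls short, here is a small sketch in JavaScript, assuming its regex engine behaves like NSRegularExpression for this pattern. The helper name basicImgSrcs and the sample markup are made up for illustration.

```javascript
// The basic pattern from this answer as a JavaScript literal; group 2 is the src value.
const BASIC_PATTERN = /(<img src="(.*?)">)+?/gi;

// Hypothetical helper: collect group 2 of every match.
function basicImgSrcs(html) {
  const found = [];
  BASIC_PATTERN.lastIndex = 0; // reset the shared global regex before scanning
  let m;
  while ((m = BASIC_PATTERN.exec(html)) !== null) {
    found.push(m[2]);
  }
  return found;
}

const html =
  '<img src="a.png">' +           // exact form: matched correctly
  '<img class="x" src="b.png">' + // attribute before src: not matched at all
  '<img src="c.png" alt="c">';    // attribute after src: capture runs on to the final quote

console.log(basicImgSrcs(html)); // [ 'a.png', 'c.png" alt="c' ]
```

The second tag is skipped and the third yields a polluted value, which is why a pattern that tolerates surrounding attributes is needed for real pages.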
Once you have the HTML, you can parse it with the SwiftSoup library. Using Swift 3:
import SwiftSoup

do {
let doc: Document = try SwiftSoup.parse(html)
let srcs: Elements = try doc.select("img[src]")
let srcsStringArray: [String?] = srcs.array().map { try? $0.attr("src").description }
// do something with srcsStringArray
} catch Exception.Error(_, let message) {
print(message)
} catch {
print("error")
}