How to get all <img src> URLs of a web page in an iOS UIWebView?

Hi all,

I am trying to get all the image urls of the current page in a UIWebView.

So here is my code.

- (void)webViewDidFinishLoad:(UIWebView*)webView {
    NSString *firstImageUrl = [self.webView stringByEvaluatingJavaScriptFromString:@"var images = document.getElementsByTagName('img');images[0].src.toString();"];
    NSString *imageUrls = [self.webView stringByEvaluatingJavaScriptFromString:@"var images= document.getElementsByTagName('img');var imageUrls = "";for(var i = 0; i < images.length; i++){var image = images[i];imageUrls += image.src;imageUrls += \\’,\\’;}imageUrls.toString();"];
    NSLog(@"firstUrl : %@", firstImageUrl);
    NSLog(@"images : %@",imageUrls);
}

The first NSLog prints the correct image src, but the second NSLog prints nothing.

2013-01-25 00:51:23.253 WebDemo[3416:907] firstUrl: https://www.paypalobjects.com/en_US/i/scr/pixel.gif
2013-01-25 00:51:23.254 WebDemo[3416:907] images :

I do not know why. Please help me...

Thanks.

+5
4 answers

Perrohunter's NSRegularExpression solution is excellent. If you do not want to build an array of matches, you can use the block-based enumerateMatchesInString method:

NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(<img\\s[\\s\\S]*?src\\s*?=\\s*?['\"](.*?)['\"][\\s\\S]*?>)+?"
                                                                       options:NSRegularExpressionCaseInsensitive
                                                                         error:&error];

[regex enumerateMatchesInString:yourHTMLSourceCodeString
                        options:0
                          range:NSMakeRange(0, [yourHTMLSourceCodeString length])
                     usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {

                         NSString *img = [yourHTMLSourceCodeString substringWithRange:[result rangeAtIndex:2]];
                         NSLog(@"img src %@",img);
                     }];

I also updated the regex pattern to solve the following problems:

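  • attributes may appear between the start of the img tag and the src attribute;
  • attributes may appear after the src attribute and before the closing >;
  • the img tag may contain newlines (. does not match a newline, so the pattern uses [\s\S] instead);
  • the src value may be quoted with ' as well as ";
  • there may be whitespace between src and =, and between = and the opening quote.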
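
As a rough way to check the cases listed above outside of Xcode, the same pattern can be tried in JavaScript. This is only a sketch: extractImgSrcs and the sample markup are made up for illustration, and the outer group and trailing +? are dropped, so the src value is capture group 1 here rather than group 2.

```javascript
// JavaScript port of the pattern above (outer group removed, so src is group 1).
const IMG_SRC = /<img\s[\s\S]*?src\s*?=\s*?['"](.*?)['"][\s\S]*?>/gi;

// Hypothetical helper: return every src value found in an HTML string.
function extractImgSrcs(html) {
  return Array.from(html.matchAll(IMG_SRC), (m) => m[1]);
}

// Made-up sample markup, one tag per case from the list.
const sampleHtml = [
  '<img class="logo" src="a.png">',        // attribute before src
  '<img src="b.png" alt="B" width="10">',  // attributes after src
  '<img\n  id="c"\n  src="c.png"\n>',      // newlines inside the tag
  "<img src='d.png'>",                     // single-quoted value
  '<IMG SRC = "e.png">',                   // spaces around =, upper case
].join('\n');

console.log(extractImgSrcs(sampleHtml));
```

Each sample tag should yield exactly one src value. A value that itself contains a quote character would still confuse this pattern.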

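Reading HTML with a regex is fragile, and other approaches (for example the JSON approach suggested by Joris) may handle it more gracefully. But if you do scan for img tags with a regex, enumerateMatchesInString hands you each match as it is found, so you do not have to build the array that matchesInString returns.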

+12

Here is the JavaScript, formatted for readability:

// javascript to execute:
(function() {
    var images=document.querySelectorAll("img");
    var imageUrls=[];
    [].forEach.call(images, function(el) {
        imageUrls[imageUrls.length] = el.src;
    }); 
    return JSON.stringify(imageUrls);
})()

It returns the URLs as a JSON string, which you can parse back into an array in Objective-C:

NSString *imageURLString = [self.webview stringByEvaluatingJavaScriptFromString:@"(function() {var images=document.querySelectorAll(\"img\");var imageUrls=[];[].forEach.call(images, function(el) { imageUrls[imageUrls.length] = el.src;}); return JSON.stringify(imageUrls);})()"];

// parse json back into an array
NSError *jsonError = nil;
NSArray *urls = [NSJSONSerialization JSONObjectWithData:[imageURLString dataUsingEncoding:NSUTF8StringEncoding] options:0 error:&jsonError];

if (!urls) {
    NSLog(@"JSON error: %@", jsonError);
    return;
}

NSLog(@"Images : %@", urls);
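
One reason JSON is a convenient transport here is that a URL may itself contain a comma, so a comma-joined string cannot always be split back reliably. The sketch below illustrates this. collectImageUrls and fakeImages are hypothetical names, and the image list is passed in as a parameter so the snippet can run outside a browser (in a page you would pass document.querySelectorAll("img")).

```javascript
// Same collection logic as the script above, with the image list as a parameter.
function collectImageUrls(images) {
  const imageUrls = [];
  // A NodeList is array-like rather than a true Array, hence forEach.call.
  Array.prototype.forEach.call(images, (el) => {
    imageUrls.push(el.src);
  });
  return JSON.stringify(imageUrls);
}

// Hypothetical array-like stand-in for a NodeList of <img> elements.
const fakeImages = {
  length: 2,
  0: { src: 'https://example.com/a.png?size=10,20' },
  1: { src: 'https://example.com/b.png' },
};

const json = collectImageUrls(fakeImages);

// Parsing the JSON recovers both URLs intact (the native side does the
// equivalent with NSJSONSerialization).
const parsedUrls = JSON.parse(json);
console.log(parsedUrls.length);

// Joining with commas and splitting again cuts the first URL in two.
const naivePieces = parsedUrls.join(',').split(',');
console.log(naivePieces.length);
```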
+10

You can achieve this by running a regex over the HTML source of the loaded web view:

// Grab the HTML of the loaded page from the web view.
NSString *yourHTMLSourceCodeString = [webView stringByEvaluatingJavaScriptFromString:@"document.body.innerHTML"];

NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(<img src=\"(.*?)\">)+?"
                                                                       options:NSRegularExpressionCaseInsensitive
                                                                         error:&error];

NSArray *matches = [regex matchesInString:yourHTMLSourceCodeString
                                  options:0
                                    range:NSMakeRange(0, [yourHTMLSourceCodeString length])];

// count returns NSUInteger, so cast it for the %lu format specifier.
NSLog(@"total matches %lu", (unsigned long)[matches count]);

for (NSTextCheckingResult *match in matches) {
    // Capture group 2 holds the value of the src attribute.
    NSString *img = [yourHTMLSourceCodeString substringWithRange:[match rangeAtIndex:2]];
    NSLog(@"img src %@", img);
}

This is a pretty basic regex that captures the src value of a tag written exactly as <img src="...">. It needs to be extended if your image tags carry other attributes such as class or id.
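
To see what that limitation looks like in practice, here is a small JavaScript sketch that applies the same basic pattern to a few made-up tags. basicSrcs and the sample strings are hypothetical, and the outer group is dropped, so the src value is capture group 1 here.

```javascript
// JavaScript port of the basic pattern (outer group removed, so src is group 1).
const BASIC_IMG_SRC = /<img src="(.*?)">/gi;

// Hypothetical helper: return every captured src value in an HTML string.
function basicSrcs(html) {
  return Array.from(html.matchAll(BASIC_IMG_SRC), (m) => m[1]);
}

// A tag in exactly the expected shape is captured cleanly.
const plain = basicSrcs('<img src="a.png">');

// An attribute before src means the literal text '<img src="' never matches.
const attrBefore = basicSrcs('<img class="logo" src="b.png">');

// An attribute after src is swallowed into the capture, because the pattern
// only stops at a quote that is immediately followed by '>'.
const attrAfter = basicSrcs('<img src="c.png" alt="C">');

console.log(plain, attrBefore, attrAfter);
```

The second case finds nothing and the third captures more than the URL, which is why the pattern needs extending for real-world markup.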

+6

Once you have the HTML, you can use the SwiftSoup library. Using Swift 3:

do {
    let doc: Document = try SwiftSoup.parse(html)
    let srcs: Elements = try doc.select("img[src]")
    let srcsStringArray: [String?] = srcs.array().map { try? $0.attr("src").description }
    // do something with srcsStringArray
} catch Exception.Error(_, let message) {
    print(message)
} catch {
    print("error")
}
+2