java - Extract text in order using jsoup
I want to extract the text inside the "jobtitle" class and the text inside the "summary" class. There are many elements with the same class names. I want the job title of the first one and its summary, then the job title of the next one and its summary, and so on, in order.
The following code works, but it first gives all the titles and then all the text inside the summary classes. I want the first job title and the first summary, then the second job title and the second summary, and so on. How can I modify the code to do this? Please help.
<div class=" row result" id="p_64c5268586001bd2" data-jk="64c5268586001bd2" itemscope="" itemtype="http://schema.org/jobposting" data-tn-component="organicjob">
  <h2 id="jl_64c5268586001bd2" class="jobtitle">
    <a rel="nofollow" href="/rc/clk?jk=64c5268586001bd2" target="_blank" onmousedown="return rclk(this,jobmap[0],0);" onclick="return rclk(this,jobmap[0],true,0);" itemprop="title" title="fashion assistant" class="turnstilelink" data-tn-element="jobtitle"><b>fashion</b> assistant</a>
  </h2>
  <span class="company" itemprop="hiringorganization" itemtype="http://schema.org/organization">
    <span itemprop="name">
      <a href="/cmp/itv?from=serp&campaignid=serp-linkcompanyname&fromjk=64c5268586001bd2&jcid=3bf3e8a57da58ff5" target="_blank"> itv jobs</a></span>
  </span>
  <a data-tn-element="reviewstars" data-tn-variant="cmplinktst2" class="turnstilelink " href="/cmp/itv/reviews?jcid=3bf3e8a57da58ff5" title="itv jobs reviews" onmousedown="this.href = appendparamsonce(this.href, '?campaignid=cmplinktst2&from=serp&jt=fashion+assistant&fromjk=64c5268586001bd2');" target="_blank">
    <span class="ratings"><span class="rating" style="width:49.5px;"><!-- --> </span></span><span class="slnounderline">28 reviews</span></a>
  <span itemprop="joblocation" itemscope="" itemtype="http://schema.org/place">
    <span class="location" itemprop="address" itemscope="" itemtype="http://schema.org/postaladdress"><span itemprop="addresslocality">london</span></span></span>
  <table cellpadding="0" cellspacing="0" border="0">
    <tbody><tr>
      <td class="snip">
        <div>
          <span class="summary" itemprop="description"> have passion <b>fashion</b>? responsible running our <b>fashion</b> cupboard, managing team of interns , liaising press officers to...</span>
        </div>
Document doc = Jsoup.connect("http://www.indeed.co.uk/jobs?q=fashion&l=england").timeout(5000).get();
Elements f = doc.select(".jobtitle");
Elements e = doc.select(".summary");
System.out.println("title: " + f.text());
System.out.println("details: " + e.text());
Iterate over the titles and find the summary for each title:
for (Element title : doc.select(".jobtitle")) {
    // The title's parent is the result container, so look for the summary inside it.
    Element summary = title.parent().select(".summary").first();
    System.out.format("title: %s. summary: %s%n", title.text(), summary.text());
}
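A variation on the same idea, offered as a sketch rather than a definitive implementation: iterate over the result containers instead of the titles, and guard against a result that has no summary, since `Elements.first()` returns null when nothing matches. The class name `JobPairs`, the `SAMPLE_HTML` string, and the `(none)` placeholder are hypothetical and only there to make the example runnable offline; the selectors assume the markup shown in the question, and jsoup is assumed to be on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class JobPairs {
    // Hypothetical markup, reduced from the question's snippet.
    // The second result deliberately has no summary.
    static final String SAMPLE_HTML =
          "<div class=\"row result\">"
        + "<h2 class=\"jobtitle\"><a><b>fashion</b> assistant</a></h2>"
        + "<span class=\"summary\">Run the fashion cupboard.</span>"
        + "</div>"
        + "<div class=\"row result\">"
        + "<h2 class=\"jobtitle\"><a>fashion buyer</a></h2>"
        + "</div>";

    // Returns one "title/summary" line per result, in document order.
    static List<String> extract(String html) {
        Document doc = Jsoup.parse(html);
        List<String> lines = new ArrayList<>();
        for (Element result : doc.select("div.result")) {
            // Both lookups are scoped to this result, so they stay paired.
            Element title = result.select(".jobtitle").first();
            Element summary = result.select(".summary").first();
            if (title == null) {
                continue; // not a job entry, skip it
            }
            String summaryText = (summary == null) ? "(none)" : summary.text();
            lines.add(String.format("title: %s. summary: %s", title.text(), summaryText));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : extract(SAMPLE_HTML)) {
            System.out.println(line);
        }
    }
}
```

For the live page, the same loop should work on the `Document` returned by `Jsoup.connect(...).get()` in place of `Jsoup.parse(html)`, provided the page still uses these class names.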