Is there a parser for html for hiccup structures?

I am looking for a function that changes clojure hiccup

So

   <html> </html>

turns into

[:html]

and etc.


Following from @kotarak's answer, this now works for me:

(use 'net.cgrand.enlive-html)
(import 'java.io.StringReader)

(defn enlive->hiccup
   [el]
   (if-not (string? el)
     (->> (map enlive->hiccup (:content el))
       (concat [(:tag el) (:attrs el)])
       (keep identity)
       vec)
     el))

(defn html->enlive 
  [html]
  (first (html-resource (StringReader. html))))

(defn html->hiccup [html]
  (-> html
      html->enlive
      enlive->hiccup))

=> (html->hiccup "<html><body id='foo'>hello</body></html>")
[:html [:body {:id "foo"} "hello"]]
+5
source share
3 answers

You can get the following structure html-resourcefrom enlive :

{:tag :html :attrs {} :content []}

Then move it and turn it into a hiccup structure.

(defn html->hiccup
   [html]
   (if-not (string? html)
     (->> (map html->hiccup (:content html))
       (concat [(:tag html) (:attrs html)])
       (keep identity)
       vec)
     html))

Here is a usage example:

user=>  (html->hiccup {:tag     :p
                       :content ["Hello" {:tag     :a
                                          :attrs   {:href "/foo"}
                                          :content ["World"]}
                                 "!"]})
[:p "Hello" [:a {:href "/foo"} "World"] "!"]
+8
source

All Articles