net.html: polish module, update docs (#7193)

2023-08-10 21:13:21 +03:00 · 2020-12-10 03:08:15 +08:00
parent 5fa1e403ec
commit b952bf2e6b
9 changed files with 302 additions and 446 deletions
--- a/vlib/net/html/README.md
+++ b/vlib/net/html/README.md
@@ -1,118 +1,16 @@
-# V HTML
-
-A HTML parser made in V.
+net/html is an HTML parser written in pure V.

 ## Usage
+```v oksyntax
+import net.html

-If the description below isn't enought, please look at the test files.
-
-### Parser
-
-Responsible for read HTML in full strings or splited string and returns all Tag objets of
-it HTML or return a DocumentObjectModel, that will try to find how the HTML Tree is.
-
-#### split_parse(data string)
-This functions is the main function called by parse method to fragment parse your HTML.
-
-#### parse_html(data string, is_file bool)
-This function is called passing a filename or a complete html data string to it.
-
-#### add_code_tag(name string)
-This function is used to add a tag for the parser ignore it's content. 
-For example, if you have an html or XML with a custom tag, like `<script>`, using this function, 
-like `add_code_tag('script')` will make all `script` tags content be jumped, 
-so you still have its content, but will not confuse the parser with it's `>` or `<`.
-
-#### finalize()
-When using **split_parse** method, you must call this function to ends the parse completely.
-
-#### get_tags() []Tag_ptr
-This functions returns a array with all tags and it's content.
-
-#### get_dom() DocumentObjectModel
-Returns the DocumentObjectModel for current parsed tags.
-
-### WARNING
-If you want to reuse parser object to parse another HTML, call `initialize_all()` function first.
-
-### DocumentObjectModel
-
-A DOM object that will make easier to access some tags and search it.
-
-#### get_by_attribute_value(name string, value string) []Tag_ptr
-This function retuns a Tag array with all tags in document 
-that have a attribute with given name and given value.
-
-#### get_by_tag(name string) []Tag_ptr
-This function retuns a Tag array with all tags in document that have a name with the given value.
-
-#### get_by_attribute(name string) []Tag_ptr
-This function retuns a Tag array with all tags in document that have a attribute with given name.
-
-#### get_root() Tag_ptr
-This function returns the root Tag.
-
-#### get_all_tags() []Tag_ptr
-This function returns all important tags, removing close tags.
-
-### Tag
-
-An object that holds tags information, such as `name`, `attributes`, `children`.
-
-#### get_children() []Tag_ptr
-Returns all children as an array.
-
-#### get_parent() &Tag
-Returns the parent of current tag.
-
-#### get_name() string
-Returns tag name.
-
-#### get_content() string
-Returns tag content.
-
-#### get_attributes() map[string]string
-Returns all attributes and it value.
-
-#### text() string
-Returns the content of the tag and all tags inside it. 
-Also, any `<br>` tag will be converted into `\n`.
-
-## Some questions that can appear
-
-### Q: Why in parser have a `builder_str() string` method that returns only the lexeme string?
-    
-A: Because in early stages of the project, `strings.Builder` are used, 
-but for some bug existing somewhere, it was necessary to use `string` directly. 
-Later, it's planned to use `strings.Builder` again.
-
-### Q: Why have a `compare_string(a string, b string) bool` method?
-
-A: For some reason when using != and == in strings directly, it is not working. 
-So this method is a workaround.
-
-### Q: Will be something like `XPath`?
-
-A: Like XPath yes. Exactly equal to it, no.
-
-## Roadmap
- [x] Parser
-  - [x] `<!-- Comments -->` detection
-  - [x] `Open Generic tags` detection
-  - [x] `Close Generic tags` detection
-  - [x] `verify string` detection
-  - [x] `tag attributes` detection
-  - [x] `attributes values` detection
-  - [x] `tag text` (on tag it is declared as content, maybe change for text in the future)
-  - [x] `text file for parse` support (open local files for parsing)
-  - [x] `open_code` verification
- [x] DocumentObjectModel
-  - [x] push elements that have a close tag into stack
-  - [x] remove elements from stack
-  - [x] ~~create a new document root if have some syntax error (deleted)~~
-  - [x] search tags in `DOM` by attributes
-  - [x] search tags in `DOM` by tag type
-  - [x] finish dom test
-
-## License
-[MIT](../../../LICENSE)
+fn main() {
+	doc := html.parse('<html><body><h1 class="title">Hello world!</h1></body></html>')
+	tag := doc.get_tag('h1')[0] // <h1 class="title">Hello world!</h1>
+	println(tag.name) // h1
+	println(tag.content) // Hello world!
+	println(tag.attributes) // {'class':'title'}
+	println(tag.str()) // <h1 class="title">Hello world!</h1>
+}
+```
+More examples can be found in [`parser_test.v`](parser_test.v) and [`html_test.v`](html_test.v).