Twisting articles – Chinese SEO
Time flies. Since my internship in 2012 I have been in this trade for six years, and I can't help but sigh that youth does not last. In those six years I have lived through the SEO content stacking era, the SEO external link era, and the SEO click era. Along the way I have worked on websites large and small, some of which grew from 0 to 10,000 (1W) average daily visits. Of course, sites were also often penalized or dropped ("K'd") by the search engine.
All of this builds on the continuous practice of predecessors such as Zac, Zhang Guoping, Fu Wei, night interest, Zhang Wenyi, and Lu Songsong, and of peers such as GOGO闯. The community is large and it is hard to keep track of everyone in it. They may not know me, but that does not prevent me from learning from good people.
As technology advances, the once-brilliant PC era has been replaced by the gradually emerging mobile era. The Internet has also become impatient, afraid of being left behind by the times, so the relatively slow method of SEO is less and less favored by the mainstream. SEOers have kept their dignity at large websites in medical, tourism, e-commerce, real estate and similar industries, but less so in other industries.
The position of SEOers in traditional enterprises is especially awkward. Either the company treats the SEOer as a "god" who must handle both SEO and SEM, and once the unit price of SEM ad clicks rises and information-feed advertising takes off, must also run information-feed ads part-time, getting lost in the complicated work; or the company does not value the SEOer, who sits in a dispensable position. I often ask myself which of the two I am.
As a result, SEOers are gradually seeking changes, transitioning to Internet-related positions such as operations, products, new media, and copywriting. Some have found themselves in the transition, while others have become more confused in the transition.
Let’s take a look at the years we have traveled.
SEO content stacking era
The TF-IDF (Term Frequency – Inverse Document Frequency) algorithm is a statistical method for assessing how important a word is to one document within a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document, and decreases with how frequently it appears across the corpus. The algorithm is widely used in data mining, text processing and information retrieval, for example to find the keywords of an article.
The main idea of TF-IDF is that if a word or phrase appears with high frequency (TF) in one article and rarely appears in other articles, it distinguishes that article well and is suitable for classification. TF-IDF is actually TF*IDF, where TF (Term Frequency) is the frequency at which the term appears in the document and IDF (Inverse Document Frequency) measures how rare the term is across all documents.
The main idea of IDF is that the fewer documents contain a word, the larger its IDF and the better the word distinguishes documents. To get the keywords of an article, we can calculate the TF-IDF of every noun that appears in it. The larger the TF-IDF, the better the term distinguishes the article, so the few words with the largest TF-IDF values can be used as the article's keywords.
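The keyword selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the formula any search engine actually uses: it assumes the documents are already split into words, uses the plain logarithm of (number of documents / documents containing the term) as IDF, and the function names `tf_idf` and `top_keywords` are made up for this example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            # TF (share of the document) times IDF (rarity across documents).
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(doc_scores, k=3):
    """Pick the k terms with the largest TF-IDF (ties broken alphabetically)."""
    ranked = sorted(doc_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy corpus, for illustration only.
docs = [
    ["skin", "white", "skin", "care"],
    ["skin", "mask"],
    ["travel", "guide"],
]
scores = tf_idf(docs)
print(top_keywords(scores[0], 2))
```

Note that "skin" is the most frequent word in the first document but does not come out on top, because it also appears in the second document and so has a lower IDF.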
Content stacking grew out of the TF-IDF algorithm. Pseudo-original tools prevailed during this period, and keywords were piled up to raise keyword density. The 2% to 8% density recommended by the stationmaster's home was regarded as the industry standard, together with the rule of one keyword in four places (the title, the keywords and description tags, the content, and the anchor text). With these, SEOers were like fish in water in search.
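For reference, a keyword density check of this kind can be sketched as follows. The definition used here (characters taken up by the keyword divided by total characters) is an assumption for illustration; the exact formula behind the 2% to 8% recommendation is not given in this article, and the function names are hypothetical.

```python
def keyword_density(text, keyword):
    """Share of the text's characters taken up by occurrences of the keyword."""
    if not text or not keyword:
        return 0.0
    occurrences = text.count(keyword)  # non-overlapping matches
    return occurrences * len(keyword) / len(text)


def in_recommended_range(density, low=0.02, high=0.08):
    """True when the density falls inside the 2% to 8% band quoted above."""
    return low <= density <= high


# Hypothetical page text: 3 occurrences of a 3-character keyword in 150 characters.
page = "seo" + "x" * 47 + "seo" + "x" * 47 + "seo" + "x" * 47
density = keyword_density(page, "seo")
print(f"{density:.1%}", in_recommended_range(density))
```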
SEO external link era:
PageRank, also known as page rank, page level, Google left ranking or Page ranking, is a technique that ranks web pages based on the hyperlinks between them. It is one of the elements of web page ranking and is named after the surname of Google founder Larry Page.
Google uses it to reflect the relevance and importance of web pages, and it is often used in search engine optimization to evaluate the effectiveness of web page optimization. Google founders Larry Page and Sergey Brin invented the technology at Stanford University in 1998.
PageRank determines the level of a page through the vast network of hyperlinks. Google interprets a link from page A to page B as a vote by page A for page B. Google determines the new level based on the source of the vote (and even the source of the source, i.e. the pages that link to page A) and the level of the voting target. Simply put, a high-level page can raise the level of the lower-level pages it links to.
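The voting idea can be illustrated with the textbook power-iteration form of PageRank. This is a simplified sketch, not Google's production algorithm: the damping factor of 0.85, the fixed number of iterations, and the handling of pages without outgoing links are conventional choices assumed here.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to (its votes)."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical three-page site: A and B both vote for C, and C votes for A.
ranks = pagerank({"A": ["C"], "B": ["C"], "C": ["A"]})
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

In this toy graph C receives two votes and ends up highest; A receives only one vote, but it comes from the high-level page C, so A still ranks well above B, which receives none.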
The PageRank algorithm ushered in the era of external links. With multiple browsers and multiple windows open at once, we looped endlessly through Ctrl+C and Ctrl+V and were affectionately called CV engineers. Then came tools such as dark horse blog group, worm marketing assistant, and swordsman, which drew on the search engines' large traffic pool. Of course, there were also a large number of link exchange techniques and link wheel (sprocket) techniques.
SEO click era:
As search engine algorithms matured, the weight of content density and external links gradually weakened and user experience was prioritized. Click-based algorithms were then born, built on user clicks and user dwell time.
The number of clicks was kept under control and close to human behavior (2-5 clicks per keyword). The routine was to click through from the Baidu results page to the keyword's page and wait 2-10 seconds (without closing the page; the time to be adjusted), then search on Baidu again, click through to the website from the keyword's result, wait about 10 seconds, and preferably click several other links on the page, making sure that over the whole process the user stayed on the site for more than 1 minute.
User dwell time:
For a forum the best dwell time is about 3 minutes; for a portal or news website it is usually between 1 and 3 minutes.
Among the views of the SEO experts, I personally prefer:
SEO Traffic ≈ Search Demand Coverage * Inclusion * Ranking * Click Rate
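A quick worked example of the formula above. How each factor is measured is an assumption made for illustration: the first factor is taken as the number of searches covered by the site's keywords, and the other three as rates between 0 and 1. All numbers are hypothetical.

```python
def estimate_seo_traffic(search_demand, inclusion_rate, ranking_rate, click_rate):
    """Multiply the four factors of the traffic formula.

    search_demand  -- searches per period for the keywords the site covers
    inclusion_rate -- share of the keyword pages the search engine has included
    ranking_rate   -- share of included pages that rank where users see them
    click_rate     -- share of searchers who click the ranked result
    """
    rates = {
        "inclusion_rate": inclusion_rate,
        "ranking_rate": ranking_rate,
        "click_rate": click_rate,
    }
    for name, rate in rates.items():
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {rate}")
    return search_demand * inclusion_rate * ranking_rate * click_rate


# Hypothetical numbers, for illustration only.
visits = estimate_seo_traffic(100_000, 0.6, 0.2, 0.1)
print(round(visits))  # 1200
```

Because the factors are multiplied, doubling any one of them doubles the estimate, and a factor near zero cancels out gains in the other three.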
Therefore, the following four factors will be specifically described.
Practice articles
1. Search demand coverage
Search demand coverage can be understood simply as a keyword library (thesaurus): you build a keyword requirements table for your own industry. There are several ways to find keywords:
1) Baidu related search
2) Baidu drop-down box
3) Keyword Planner (http://www2.baidu.com/)
4) 5118 (http://www.5118.com/)
5) Thesaurus (http://www.ciku5.com/)
6) Love Station Dictionary (https://ci.aizhan.com/)
7) Sogou input method dictionary (https://pinyin.sogou.com/dict/)
8) Create a tag vocabulary on the ad site
Building a keyword requirements table serves two purposes:
1) Creating (or collecting) content for the keywords in the library
2) Grouping keywords with similar attributes (meanings) into an aggregate (TAG) page
For example: how to make the skin white _ how to make the skin white _ how the skin becomes white
These keywords mean much the same thing, so an aggregate page built from them meets the search needs of more people to a certain extent.
2. Inclusion
Inclusion ≈ Content Quantity * Content Quality
1) Quantity of content
a. Collection method: content gathered by collection is relatively low in quality, but it wins on quantity. Pages are generated from the previously compiled keyword library by grouping similar words. The demo idea is as follows:
The first step: suppose you are building an entertainment website. Search for the word "entertainment" on a search engine, find the competitors' websites, and record their URLs.
The second step: put the recorded URLs into 5118, love stations, and webmaster tools to mine the keywords they rank for, then export these keywords. 5118 is used as the example here.
The third step: the exported keywords will be messy, so classify them by type and pick out the core word of each keyword. This step relies on the third-party Python library textrank4zh.
After processing, the results contain the core words and their keywords.
Finally, a VB tool arranges the final result so that keywords with the same core word are displayed in one column. The keywords in one column can be treated as words of the same type and used to build the same tag (TAG) page.
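The grouping performed with the VB tool can also be sketched in Python. This is a stand-in, not the author's tool: it assumes the core word of each keyword has already been extracted in the previous step and is supplied as (core word, keyword) pairs, and the function names are hypothetical.

```python
from collections import defaultdict


def group_by_core_word(pairs):
    """Collect keywords that share a core word; each group feeds one TAG page."""
    groups = defaultdict(list)
    for core_word, keyword in pairs:
        if keyword not in groups[core_word]:  # skip exact duplicates
            groups[core_word].append(keyword)
    return dict(groups)


def shortest_keyword(keywords):
    """The keyword with the fewest characters in a group."""
    return min(keywords, key=len)


# Hypothetical exported keywords with their core words.
pairs = [
    ("skin", "how to make the skin white"),
    ("skin", "how the skin becomes white"),
    ("skin", "skin whitening"),
    ("Liu Yifei", "Liu Yifei movies"),
    ("Liu Yifei", "Liu Yifei Weibo"),
]
for core_word, keywords in group_by_core_word(pairs).items():
    print(core_word, "->", keywords, "| shortest:", shortest_keyword(keywords))
```

The shortest keyword of each group is printed because the fourth step recommends collecting content with the keyword that has the fewest characters.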
The fourth step: collect content according to the keywords of each tag page. It is recommended to use the keyword with the fewest characters in the tag page. Content can be collected from today's headline, major news websites, or news apps. For the code, see "Catch 10W data, analyze 1W explosions, and write 10W reading content".
The fifth step: after collecting the content, build a local search engine such as fire search or xunsearch and import the content into it. Taking xunsearch as the example, create a new Linux system in a virtual machine and install xunsearch on it (for the installation, see "xunsearch installation steps"). We can then search for our target keywords in the resulting search engine.
The sixth step: searching by hand inside the virtual machine is very inefficient, so use Python instead, with the virtual machine's IP as the URL. This returns the articles that match each tag keyword, from which the corresponding tag page is generated.
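A minimal sketch of that lookup using only the Python standard library. The IP address, the `/search.php` path and the `q` parameter are placeholders assumed for illustration; they depend on how the search page is deployed on the virtual machine and are not taken from the xunsearch documentation.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(host, keyword, path="/search.php", param="q"):
    """Build the search URL on the virtual machine (path and param are placeholders)."""
    return f"http://{host}{path}?{urlencode({param: keyword})}"


def fetch_results(host, keyword, timeout=10):
    """Request the search page for one tag keyword and return the raw response text."""
    with urlopen(build_search_url(host, keyword), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical virtual machine address and tag keywords.
VM_HOST = "192.168.1.10"
for keyword in ["skin care", "Liu Yifei movies"]:
    print(build_search_url(VM_HOST, keyword))
    # html = fetch_results(VM_HOST, keyword)  # needs the search engine to be running
```

The response still has to be parsed into article titles and bodies before a tag page can be generated; that part depends on the page layout and is left out here.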
b. Part-time method: post part-time recruitment notices in the forums, post bars, and QQ groups of major universities, set up a part-time team, assign keywords to the part-time writers, and pay per article. The red rate (how much of the text the search engine highlights in red as matching existing content) is used as the content quality standard. The lower the red rate, the higher the content quality from the search engine's perspective. The part-time method gives more control over content quality.
If your company has development capability, it is recommended to build an article review system and upload the keywords you want to target to it. Part-time writers pick keywords themselves and upload the content to the system after writing it. The system then takes random excerpts from the article and queries the search engine to judge the red rate.
When the red rate meets the threshold you set, the article is posted to the website automatically and the writer enters the settlement and payment step; otherwise the article is rejected. This greatly saves labor costs.
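The publish-or-reject decision of such a review system can be sketched as follows. This is an assumption-heavy illustration: the search engine query that produces each red rate is left out and the rates are passed in as numbers, the threshold of 0.3 is arbitrary, and the sentence splitting is deliberately naive.

```python
import random


def sample_excerpts(article, k=3, seed=None):
    """Pick up to k random sentences from the article to check against the search engine."""
    normalized = article.replace("。", ".")
    sentences = [s.strip() for s in normalized.split(".") if s.strip()]
    rng = random.Random(seed)
    return rng.sample(sentences, min(k, len(sentences)))


def review(red_rates, threshold=0.3):
    """Publish when the average red rate of the excerpts is at or below the threshold."""
    if not red_rates:
        return "reject"  # nothing could be checked
    average = sum(red_rates) / len(red_rates)
    return "publish" if average <= threshold else "reject"


article = "First sentence. Second sentence. Third sentence. Fourth sentence."
excerpts = sample_excerpts(article, k=2, seed=1)
print(excerpts)

# Hypothetical red rates, as they might come back for the sampled excerpts.
print(review([0.10, 0.25]))  # low red rate: publish
print(review([0.90, 0.80]))  # mostly matches existing content: reject
```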
2) Content quality
Having covered content quantity, let's look at how to build high-quality content. High-quality content must meet user needs. Take the keyword "Liu Yifei": here you need two tools, "Baidu Index" and "Baidu know".
Open the demand map in Baidu Index and you can see that people searching for "Liu Yifei" may also care about Liu Yifei Weibo, Liu Yifei movies, Liu Yifei Song Chengxian, Liu Yifei wedding photos, and so on. You can write content around these needs.
Open Baidu know and you will find the questions about Liu Yifei that users care most about. You can also write content that answers these questions.
3. Ranking
With search demand coverage and inclusion in place, the next step is ranking. Although search engines now give external links less weight, high-quality links are still very important. There are fewer and fewer platforms where external links can be placed: many forums no longer allow links, and only a small number of sites, such as Sina blog and Netease blog, still do, so keep digging for more. Besides posting links yourself, there are several other ways:
1) Build internal links.
2) Exchange friendship links. These are not limited to the home page; list pages and content pages work too. If you run a large site, you can also exchange links by category and by city (friendship link exchanges need ongoing maintenance).
3) Buy links, if you have sufficient funds.
4) Scanning for vulnerabilities and adding black links (use with caution; this is against the law). Some people exploit vulnerabilities in open source CMSs to obtain a website's backend and account passwords and add black links.
4. Click rate
Assuming clicks are not being faked, how do you improve the click rate? After all, Baidu's Thunder algorithm also targets click manipulation.
1) Title optimization: titles with words such as "free" or "learn in XX days" are relatively eye-catching and can improve the click rate to a certain extent.
2) Pictures: use images of at least 800px*800px, which can effectively improve the rate at which results are shown with a picture.
3) For a strong brand, official website certification is recommended.
Extravagant Spider Pool:
A spider pool uses multiple servers and site domain names to run regular content sites that raise a large number of spiders by providing a large amount of content for them to crawl every day. When some links need to be crawled, you submit them to the spider pool server, and a lot of spiders will crawl these URLs quickly.
At present, a spider pool still helps somewhat with inclusion. For a site with millions of pages, you can consider using a spider pool to improve the inclusion rate, but the cost of a spider pool is not low.
Spider pool program: 2000 or so
258ip server: about 1000/month
Domain names: 20 or so each, starting from 500 domain names
This approach pushes a large number of long-tail keywords to Baidu for inclusion and ranking through a large platform, which is very good for attracting spiders. It can also be tried on a monthly basis. Readers who are familiar with it know that Songsong Mall has added packages of hundreds of domain names, and the effect is good.
A station group usually consists of several to several hundred websites. Put simply, it is a group of websites that all belong to one person and are called that webmaster's station group. In the past, station groups were mostly generated in batches with station group programs, and the effect of such station groups was relatively poor.
For a traditional enterprise whose keywords are not very competitive, a refined station group is worth considering: one keyword corresponds to one website, one server hosts 5 domain names, and each domain name resolves to a directory, so that one backend program serves 5 websites.
With 4 servers, that is 20 websites. For a traditional enterprise, 20 websites leave a lot of room to operate: exchanging friendship links and building internal links becomes easy, and to a certain extent you can dominate the industry keywords of a small industry.
January 25, 2017