[FLINK-8764] [docs] Adjust quickstart documentation

PomeloHwang · Feb 26, 2018 · 647c552
1 parent c6f8406
commit 647c552

Showing 2 changed files with 52 additions and 162 deletions.
diff --git a/docs/quickstart/java_api_quickstart.md b/docs/quickstart/java_api_quickstart.md
@@ -1,6 +1,6 @@
 ---
-title: "Sample Project using the Java API"
-nav-title: Sample Project in Java
+title: "Project Template for Java"
+nav-title: Project Template for Java
 nav-parent_id: start
 nav-pos: 0
 ---
@@ -86,120 +86,51 @@ quickstart/
         │       └── myorg
         │           └── quickstart
         │               ├── BatchJob.java
-        │               ├── SocketTextStreamWordCount.java
-        │               ├── StreamingJob.java
-        │               └── WordCount.java
+        │               └── StreamingJob.java
         └── resources
             └── log4j.properties
 {% endhighlight %}
 
-The sample project is a __Maven project__, which contains four classes. _StreamingJob_ and _BatchJob_ are basic skeleton programs, _SocketTextStreamWordCount_ is a working streaming example and _WordCountJob_ is a working batch example. Please note that the _main_ method of all classes allow you to start Flink in a development/testing mode.
+The sample project is a __Maven project__, which contains two classes: _StreamingJob_ and _BatchJob_, the basic skeleton programs for a *DataStream* and a *DataSet* program, respectively.
+The _main_ method is the entry point of the program, both for in-IDE testing/execution and for proper deployments.
 
 We recommend you __import this project into your IDE__ to develop and
-test it. If you use Eclipse, the [m2e plugin](http://www.eclipse.org/m2e/)
+test it. IntelliJ IDEA supports Maven projects out of the box.
+If you use Eclipse, the [m2e plugin](http://www.eclipse.org/m2e/)
 allows to [import Maven projects](http://books.sonatype.com/m2eclipse-book/reference/creating-sect-importing-projects.html#fig-creating-import).
 Some Eclipse bundles include that plugin by default, others require you
-to install it manually. The IntelliJ IDE supports Maven projects out of
-the box.
+to install it manually. 
 
-
-*A note to Mac OS X users*: The default JVM heapsize for Java is too
+*A note to Mac OS X users*: The default JVM heapsize for Java may be too
 small for Flink. You have to manually increase it. In Eclipse, choose
 `Run Configurations -> Arguments` and write into the `VM Arguments`
 box: `-Xmx800m`.
 
 ## Build Project
 
-If you want to __build your project__, go to your project directory and
-issue the `mvn clean install -Pbuild-jar` command. You will
-__find a jar__ that runs on every Flink cluster with a compatible
-version, __target/original-your-artifact-id-your-version.jar__. There
-is also a fat-jar in __target/your-artifact-id-your-version.jar__ which,
-additionally, contains all dependencies that were added to the Maven
-project.
+If you want to __build/package your project__, go to your project directory and
+run the `mvn clean package` command.
+You will __find a JAR file__ that contains your application, plus connectors and libraries
+that you may have added as dependencies to the application: `target/<artifact-id>-<version>.jar`.
+
+__Note:__ If you use a different class than *StreamingJob* as the application's main class / entry point,
+we recommend you change the `mainClass` setting in the `pom.xml` file accordingly. That way, Flink
+can run the application from the JAR file without additionally specifying the main class.
 
 ## Next Steps
 
 Write your application!
 
-The quickstart project contains a `WordCount` implementation, the
-"Hello World" of Big Data processing systems. The goal of `WordCount`
-is to determine the frequencies of words in a text, e.g., how often do
-the terms "the" or "house" occur in all Wikipedia texts.
-
-__Sample Input__:
-
-~~~bash
-big data is big
-~~~
-
-__Sample Output__:
-
-~~~bash
-big 2
-data 1
-is 1
-~~~
-
-The following code shows the `WordCount` implementation from the
-Quickstart which processes some text lines with two operators (a FlatMap
-and a Reduce operation via aggregating a sum), and prints the resulting
-words and counts to std-out.
-
-~~~java
-public class WordCount {
-
-  public static void main(String[] args) throws Exception {
-
-    // set up the execution environment
-    final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
-
-    // get input data
-    DataSet<String> text = env.fromElements(
-        "To be, or not to be,--that is the question:--",
-        "Whether 'tis nobler in the mind to suffer",
-        "The slings and arrows of outrageous fortune",
-        "Or to take arms against a sea of troubles,"
-        );
-
-    DataSet<Tuple2<String, Integer>> counts =
-        // split up the lines in pairs (2-tuples) containing: (word,1)
-        text.flatMap(new LineSplitter())
-        // group by the tuple field "0" and sum up tuple field "1"
-        .groupBy(0)
-        .sum(1);
-
-    // execute and print result
-    counts.print();
-  }
-}
-~~~
-
-The operations are defined by specialized classes, here the LineSplitter class.
-
-~~~java
-public static final class LineSplitter implements FlatMapFunction<String, Tuple2<String, Integer>> {
-
-  @Override
-  public void flatMap(String value, Collector<Tuple2<String, Integer>> out) {
-    // normalize and split the line
-    String[] tokens = value.toLowerCase().split("\\W+");
-
-    // emit the pairs
-    for (String token : tokens) {
-      if (token.length() > 0) {
-        out.collect(new Tuple2<String, Integer>(token, 1));
-      }
-    }
-  }
-}
-~~~
-
-{% gh_link /flink-examples/flink-examples-batch/src/main/java/org/apache/flink/examples/java/wordcount/WordCount.java "Check GitHub" %} for the full example code.
-
-For a complete overview over our API, have a look at the
+If you are writing a streaming application and are looking for inspiration on what to write,
+take a look at the [Stream Processing Application Tutorial]({{ site.baseurl }}/quickstart/run_example_quickstart.html#writing-a-flink-program).
+
+If you are writing a batch processing application and are looking for inspiration on what to write,
+take a look at the [Batch Application Examples]({{ site.baseurl }}/dev/batch/examples.html).
+
+For a complete overview of the APIs, have a look at the
 [DataStream API]({{ site.baseurl }}/dev/datastream_api.html) and
 [DataSet API]({{ site.baseurl }}/dev/batch/index.html) sections.
+
 If you have any trouble, ask on our
 [Mailing List](http://mail-archives.apache.org/mod_mbox/flink-user/).
 We are happy to provide help.

diff --git a/docs/quickstart/scala_api_quickstart.md b/docs/quickstart/scala_api_quickstart.md
@@ -1,6 +1,6 @@
 ---
-title: "Sample Project using the Scala API"
-nav-title: Sample Project in Scala
+title: "Project Template for Scala"
+nav-title: Project Template for Scala
 nav-parent_id: start
 nav-pos: 1
 ---
@@ -173,14 +173,18 @@ quickstart/
                 └── myorg
                     └── quickstart
                         ├── BatchJob.scala
-                        ├── SocketTextStreamWordCount.scala
-                        ├── StreamingJob.scala
-                        └── WordCount.scala
+                        └── StreamingJob.scala
 {% endhighlight %}
 
-The sample project is a __Maven project__, which contains four classes. _StreamingJob_ and _BatchJob_ are basic skeleton programs, _SocketTextStreamWordCount_ is a working streaming example and _WordCountJob_ is a working batch example. Please note that the _main_ method of all classes allow you to start Flink in a development/testing mode.
+The sample project is a __Maven project__, which contains two classes: _StreamingJob_ and _BatchJob_, the basic skeleton programs for a *DataStream* and a *DataSet* program, respectively.
+The _main_ method is the entry point of the program, both for in-IDE testing/execution and for proper deployments.
 
-We recommend you __import this project into your IDE__. For Eclipse, you need the following plugins, which you can install from the provided Eclipse Update Sites:
+We recommend you __import this project into your IDE__.
+
+IntelliJ IDEA supports Maven out of the box and offers a plugin for Scala development.
+In our experience, IntelliJ works best for developing Flink applications.
+
+For Eclipse, you need the following plugins, which you can install from the provided Eclipse Update Sites:
 
 * _Eclipse 4.x_
   * [Scala IDE](http://download.scala-ide.org/sdk/lithium/e44/scala211/stable/site)
@@ -191,78 +195,33 @@ We recommend you __import this project into your IDE__. For Eclipse, you need th
   * [m2eclipse-scala](http://alchim31.free.fr/m2e-scala/update-site)
   * [Build Helper Maven Plugin](https://repository.sonatype.org/content/repositories/forge-sites/m2e-extras/0.14.0/N/0.14.0.201109282148/)
 
-The IntelliJ IDE supports Maven out of the box and offers a plugin for
-Scala development.
+### Build Project
 
+If you want to __build/package your project__, go to your project directory and
+run the `mvn clean package` command.
+You will __find a JAR file__ that contains your application, plus connectors and libraries
+that you may have added as dependencies to the application: `target/<artifact-id>-<version>.jar`.
 
-### Build Project
+__Note:__ If you use a different class than *StreamingJob* as the application's main class / entry point,
+we recommend you change the `mainClass` setting in the `pom.xml` file accordingly. That way, Flink
+can run the application from the JAR file without additionally specifying the main class.
 
-If you want to __build your project__, go to your project directory and
-issue the `mvn clean package -Pbuild-jar` command. You will
-__find a jar__ that runs on every Flink cluster with a compatible
-version, __target/original-your-artifact-id-your-version.jar__. There
-is also a fat-jar in  __target/your-artifact-id-your-version.jar__ which,
-additionally, contains all dependencies that were added to the Maven
-project.
 
 ## Next Steps
 
 Write your application!
 
-The quickstart project contains a `WordCount` implementation, the
-"Hello World" of Big Data processing systems. The goal of `WordCount`
-is to determine the frequencies of words in a text, e.g., how often do
-the terms "the" or "house" occur in all Wikipedia texts.
-
-__Sample Input__:
-
-~~~bash
-big data is big
-~~~
+If you are writing a streaming application and are looking for inspiration on what to write,
+take a look at the [Stream Processing Application Tutorial]({{ site.baseurl }}/quickstart/run_example_quickstart.html#writing-a-flink-program).
 
-__Sample Output__:
-
-~~~bash
-big 2
-data 1
-is 1
-~~~
-
-The following code shows the `WordCount` implementation from the
-Quickstart which processes some text lines with two operators (a FlatMap
-and a Reduce operation via aggregating a sum), and prints the resulting
-words and counts to std-out.
-
-~~~scala
-object WordCountJob {
-  def main(args: Array[String]) {
-
-    // set up the execution environment
-    val env = ExecutionEnvironment.getExecutionEnvironment
-
-    // get input data
-    val text = env.fromElements("To be, or not to be,--that is the question:--",
-      "Whether 'tis nobler in the mind to suffer", "The slings and arrows of outrageous fortune",
-      "Or to take arms against a sea of troubles,")
-
-    val counts = text.flatMap { _.toLowerCase.split("\\W+") }
-      .map { (_, 1) }
-      .groupBy(0)
-      .sum(1)
-
-    // emit result and print result
-    counts.print()
-  }
-}
-~~~
+If you are writing a batch processing application and are looking for inspiration on what to write,
+take a look at the [Batch Application Examples]({{ site.baseurl }}/dev/batch/examples.html).
 
-{% gh_link flink-examples/flink-examples-batch/src/main/scala/org/apache/flink/examples/scala/wordcount/WordCount.scala "Check GitHub" %} for the full example code.
+For a complete overview of the APIs, have a look at the
+[DataStream API]({{ site.baseurl }}/dev/datastream_api.html) and
+[DataSet API]({{ site.baseurl }}/dev/batch/index.html) sections.
 
-For a complete overview over our API, have a look at the
-[DataStream API]({{ site.baseurl }}/dev/datastream_api.html),
-[DataSet API]({{ site.baseurl }}/dev/batch/index.html), and
-[Scala API Extensions]({{ site.baseurl }}/dev/scala_api_extensions.html)
-sections. If you have any trouble, ask on our
+If you have any trouble, ask on our
 [Mailing List](http://mail-archives.apache.org/mod_mbox/flink-user/).
 We are happy to provide help.
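
The `mainClass` note added to both quickstart pages can be illustrated with a small standalone sketch. It assumes (this is not stated in the diff) that the `mainClass` setting in `pom.xml` ends up as the standard `Main-Class` attribute of the JAR manifest, which a launcher can read to find the entry point. The class name `ManifestEntryPointSketch` and the package `org.myorg.quickstart` are hypothetical, and the code uses only `java.util.jar` from the JDK, not any Flink API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical illustration: how an entry point recorded in a JAR manifest
// can be written and read back. Not part of the quickstart project.
public class ManifestEntryPointSketch {

    // Serialize a manifest that names the given class as Main-Class.
    static byte[] writeManifest(String mainClass) throws IOException {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be present, otherwise the main attributes are not written.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.MAIN_CLASS, mainClass);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        manifest.write(out);
        return out.toByteArray();
    }

    // Parse manifest bytes and return the Main-Class value, or null if absent.
    static String readMainClass(byte[] manifestBytes) throws IOException {
        Manifest manifest = new Manifest(new ByteArrayInputStream(manifestBytes));
        return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
    }

    public static void main(String[] args) throws IOException {
        // Assumed example: BatchJob is used as the entry point instead of StreamingJob.
        byte[] bytes = writeManifest("org.myorg.quickstart.BatchJob");
        System.out.println(readMainClass(bytes)); // prints "org.myorg.quickstart.BatchJob"
    }
}
```

If the manifest names the wrong class, or none, the main class has to be specified separately when the JAR is submitted, which is what the note in the docs aims to avoid.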