Skip to content
This repository was archived by the owner on May 12, 2021. It is now read-only.

Conversation

@erandiganepola
Copy link
Contributor

This PR contains the REST server implementation for VXQuery-180. It has been written on top of hyracks-http.

I will be adding more test cases along with tests for complex queries in the coming week. Please review my code and provide me with your valuable feedback.

Thank you!

@prestoncarman
Copy link
Contributor

Is there any documentation that needs to be updated? Such as managing a cluster: http://vxquery.apache.org/user_cluster_installation.html
Also should we create an API page to document the end points?

@prestoncarman
Copy link
Contributor

Can you explain how this project interacts with vxquery-server? Will the server start the REST api?

@erandiganepola
Copy link
Contributor Author

Hi @prestoncarman! At the moment, users have to setup the cluster separately and give the cluster controller's ip and port in a properties file.

But I'm currently going through the AsterixDB's CCApplication class to adapt a similar thing to VXQueryApplication class.

Your suggestions on this will greatly help me to come up with a solution.

Thank you!

@prestoncarman
Copy link
Contributor

I think the user should have a single command/script to start the Cluster (Hyracks - cluster controller and all data nodes and the REST service). It would be nice if all the setting for the cluster were in a single properties file. I would think that vxquery-server project would be responsible for starting the REST service after the cluster controller has been started. This may require some refactoring. Does this make sense?

Copy link
Contributor

@prestoncarman prestoncarman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a few comments. I did not look to closely at the REST API specific code. Ian can give you feedback on REST specific code. Your making great progress.

<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<hyracks.fullstack.version>0.3.1</hyracks.fullstack.version>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope this version is temporary until we can move VXQuery to the next Hyracks version. Is that true? It would be nice to be dependent on a single Hyracks version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes of course @prestoncarman. This will be removed when VXQuery moves to next Hyracks version (0.3.1 - because this is the only version having hyracks-http)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After merging #168, this can be removed I guess.

<artifactId>apache-vxquery-rest</artifactId>

<dependencies>
<dependency>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please sort the dependencies in alphabetical order.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prestoncarman sure. I will do that


public void stop() {
try {
LOGGER.log(Level.CONFIG, "Stopping rest server");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why Level.CONFIG?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should have been FINE. I will change that.

LOGGER.log(Level.SEVERE, "Error occurred when starting rest server", e);
throw new VXQueryRuntimeException("Unable to start REST server", e);
}
LOGGER.log(Level.INFO, "Rest server started");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be in the try? For consistency, have the LOGGER messages in the same place for the start and stop functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are throwing a runtime exception in the catch block, this log line won't be reached if server start failed. I agree the log lines should be consistent. I will take stop() method's log line out of the try block for consistency. @prestoncarman are you ok with that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

}
}

private VXQueryConfig loadConfiguration() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the configuration code should come from vxquery-server project. Then we can have all server properties in one place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prestoncarman Yes. I agree with you. As per my understanding, VXQuery server is started by CCDriver class. I think I need to write a new class wrapping CCDriverclass. Is that so?

Currently, starting cluster controllers, stopping them and starting node controllers are done by 3 different classes, CCDriver, NCDriver and VXQueryClusterShutdown. If I'm to start/stop REST server along with them, all of them needs to be taken to one place (At least cluster start/stop). Am I correct?

* limitations under the License.
*/

package org.apache.vxquery.rest.exceptions;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could go in vxquery-core -> org.apache.vxquery.exceptions.


@BeforeClass
public static void setUp() throws Exception {
System.setProperty(Constants.Properties.VXQUERY_PROPERTIES_FILE, "src/test/resources/vxquery.properties");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should move the local hyrack creation here and make the REST service only use the configuration setting you specify.
Take a look at vxquery-xtest as an example of how to set up a test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will have a look and update the classes correspondingly. Thanks for pointing this out!


}

private static QueryResponse getQueryResponse(URI uri, String accepts) throws IOException, JAXBException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to test as many of the REST parameters as possible. I think using a simple query "1 + 1" would also work here. You can even try a query with an error, to show it produces the correct message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was in my mind to write more test cases for different queries. I will be able to work on that this week.

private static final Logger LOGGER = Logger.getLogger(RestServerTest.class.getName());

private static final String QUERY_RESULT_ENDPOINT = "/vxquery/query/result/";
private static final String QUERY = "for $x in doc(\"src/test/resources/dblp.xml\")/dblp/proceedings where $x/year=1990 return $x/title";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would change the XML file to be something smaller. You could use our test files from GHCND test. vxquery-xtest/src/test/resources/TestSources/ghcnd/... This would make your code review only few thousand lines instead of 181K. Plus, these files we created.

@@ -0,0 +1,76 @@
#
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file doesn't need to be committed. I used it locally for handling logs. I will remove it from git.

System.setProperty(Constants.Properties.HYRACKS_CLIENT_IP, ccConfig.clientNetIpAddress);
System.setProperty(Constants.Properties.HYRACKS_CLIENT_PORT, String.valueOf(ccConfig.clientNetPort));

application = new VXQueryApplication();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The VXQueryApplication shouldn't have to be instantiated explicitly. The ClusterController should start it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. I got the point. I can move on with CAApplication as you suggested.

ccConfig.clusterNetPort = 39001;
ccConfig.httpPort = 39002;
ccConfig.profileDumpPeriod = 10000;
clusterControllerService = new ClusterControllerService(ccConfig);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what version of Hyracks you're on but you should be able to add the VXQueryApplication here either as a 2nd parameter to ClusterControllerService or as part of the CCConfig and then it will be started when you call clusterControllerService.start()

# limitations under the License.
#

vxquery.rest.server.port=8081 No newline at end of file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be integrated with the parameters for the rest of VXQuery so that when you start the VXQueryApplication the REST interface will always have the parameters it needs.

@erandiganepola
Copy link
Contributor Author

I have altered my implementation to match the feedback given above in the PR. Now along with the ClusterController, VXQueryApplication starts. RestServer is started within the VXQueryApplication. Also I have moved the configuration loading part to VXQueryApplication start time as well. I did this by giving the VXQueryApplication as a command line argument through the app-assembler maven plugin.

I'll cover the documentation part during my last two weeks of the timeline (as I proposed in GSoC proposal timeline)

@prestoncarman
Copy link
Contributor

The REST code should still be in its own maven project. See the mailing list for the project hierarchy. Just make the configuration (the port number) for REST come from the various calling application.

@erandiganepola
Copy link
Contributor Author

I got the point. I will move the REST code into its own maven project.

Copy link
Contributor

@prestoncarman prestoncarman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very close. Lets get these minor changes in so we can merge this change this week.

*/
public class QueryRequest {

public QueryRequest(String requestId, String statement) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does a query request need both the requestID and statement at the same time? Would you only pass the query first, then on the second request send the requestId ("resultId").
I think we should create a separate constructor for the initial statement: QueryRequest(String statement). It seems that null is a common argument to requestId.

QueryRequest request = new QueryRequest(null, "delete nodes /a/b//node()");
runTest(request, CONTENT_TYPE_JSON, 400);
runTest(request, CONTENT_TYPE_XML, 400);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice addition of negative tests. What about a request for a bad result Id?

@Test
public void testSimpleQuery001() throws Exception {
QueryRequest request = new QueryRequest(null, "1+1");
request.setShowAbstractSyntaxTree(true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about test that do not return the AST, OET, RP, TET?

@@ -0,0 +1,16 @@
#
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this file as it is no longer needed.

@@ -0,0 +1,243 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These test seems like REST only tests and should be in the REST project.

protected static VXQuery vxQuery;

@BeforeClass
public static void setUp() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setup, start, stop, and teardown methods should be moved to the main code in the rest server project. Create a new class called VXQueryLocalCluster. Then used this class here and in the CLI for creating a local cluster.
Here is an example local cluster class: https://github.com/apache/asterixdb/blob/master/asterixdb/asterix-app/src/main/java/org/apache/asterix/api/common/AsterixHyracksIntegrationUtil.java

@erandiganepola
Copy link
Contributor Author

I updated the PR according to the suggestions given above and the VXQuery CLI code is pushed to this PR as well.

Copy link
Contributor

@prestoncarman prestoncarman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the change is almost ready. One thing I did notice is that most of the files need to be formatted to VXQuery's standards. Take a look at http://vxquery.apache.org/development_eclipse_setup.html for more details. Also, I sent you a patch to include in your change.

<artifactId>maven-site-plugin</artifactId>
</plugin>
-->
<!--
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove commented out code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! I didn't remove that previously because it was there already in the previous CLI code.

VXQuery vxq = new VXQuery(opts);
vxq.execute();
// if -timing argument passed, show the starting and ending times
if (opts.timing) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we are loosing functionality here with timing. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. That's one thing I mentioned in the email sent to you (4-5 days back). Since we are using 2 requests to execute and fetch results for a given query, we are missing the actual time spent on executing the query. But in the CLI, we have a small chance since we are waiting for the result as soon as we submit the query to be processed. Based on that assumption, I can add the missing timing metrics if you think it is ok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember seeing the e-mail. I guess this is a much larger question about the REST workflow. I guess this can be added back when someone needs to do performance testing. Does the code still support repeatedExecution? If so, I think this is needed. If not, then lets remove these CLI arguments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prestoncarman: Based on the discussion me and @parshimers had few minutes ago, I think the execution time is an important metric. AsterixDB don't have this problem since they are using synchronous query requests while they also support asynchronous version which doesn't measure execution time accurately. In our case, since we are using 2 requests, one to submit query and one to fetch results, we have lost the link between execution times (since we are not waiting for hyracks to finish query execution).

On repeated executions: I have halted implementing that part because, as I understand, repeated executions are used for statistical purposes. Since we have a problem in measuring execution time, repeated executions doesn't mean anything at the moment. What do you think?

I will raise this question on the mailing list as well, so that we can discuss further on this.

printField("Runtime Plan", response.getRuntimePlan());
}

// System.out.println(String.format("Reading results for '%s', result ID: %d", xqFile, response.getResultId()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove commented out code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

return;
}

System.err.println();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we are adding a lot of output to the CLI. Do you think this helps the user?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I agree with you. I have reduced few lines out of them.

}

JobId jobId = hcc.startJob(spec, EnumSet.of(JobFlag.PROFILE_RUNTIME));
CloseableHttpClient httpClient = HttpClients.custom()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file should be formatted using AsterixDB standard. See the website for developer code formatting details.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do that today


@Option(name = "-compileonly", usage = "Compile the query and stop.")
private boolean compileOnly;
// Optional (Not supported by REST API) parameters. Only used for creating a local hyracks cluster
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not support these options?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prestoncarman: What I meant by that comment was, if the user is giving ip and port for an already running REST server, we have no control over these parameters then (because the current REST api specs don't support those parameters). Otherwise, if we are creating a local hyracks cluster, we have control over them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then lets remove these options. They should be defined in a server config (either local or on the server).

*
* @author Erandi Ganepola
*/
public class VXQuery {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this class needs a better name. Its confusing to call it VXQuery. Really this is the REST service ( maybe call it RestService). Also, why put it in a package called core? I was a little confused since we have a project called core.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prestoncarman Agree with you. I moved this class to rest.service package. What about VXQueryService for the class name?

@Test
public void testSimpleQuery001() throws Exception {
QueryRequest request = new QueryRequest("1+1");
request.setShowAbstractSyntaxTree(true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need test for when setShowAbstractSyntaxTree is false. Same goes for all other settings.

Copy link
Contributor Author

@erandiganepola erandiganepola Aug 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Now I have created a new test case for that in SuccessResponseTest.

#org.apache.vxquery.available_processors=-1

# Number of local node controllers to be created when creating a local hyracks cluster
org.apache.vxquery.local_nc=2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the number of local node controllers can be removed. When running a local cluster, it should always be one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prestoncarman I will change that. I have been keeping this file for reference to show that we can provide system properties to the VXQueryApplication through this file through -config app argument. I will remove this when I finish documenting the above use case.

@@ -0,0 +1,30 @@
#
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this file is empty. Can it be removed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a properties file is good, but it should be located in vxquery-server. I think this is an issue we will have to address later when we update the vxquery-server to use NCService.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prestoncarman Yes. My idea is also to use this along with vxquery-server CLI. When starting the vxquery-server, users can specify a properties file by adding -config foo/bar.properties to the CLI app arguments (after a --. ex: "-- -restPort 8080 -config foo/bar.properties").


/**
* Submit a {@link QueryRequest} and fecth the resulting {@link QueryResponse}
* Submit a {@link QueryRequest} and fetth the resulting {@link QueryResponse}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo?

Copy link
Contributor

@prestoncarman prestoncarman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very close. While I still think as a community we are not sure how to structure the maven projects, this gets us closer and we can always restructure it later.

VXQuery vxq = new VXQuery(opts);
vxq.execute();
// if -timing argument passed, show the starting and ending times
if (opts.timing) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember seeing the e-mail. I guess this is a much larger question about the REST workflow. I guess this can be added back when someone needs to do performance testing. Does the code still support repeatedExecution? If so, I think this is needed. If not, then lets remove these CLI arguments.


@Option(name = "-compileonly", usage = "Compile the query and stop.")
private boolean compileOnly;
// Optional (Not supported by REST API) parameters. Only used for creating a local hyracks cluster
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then lets remove these options. They should be defined in a server config (either local or on the server).

@@ -0,0 +1,30 @@
#
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a properties file is good, but it should be located in vxquery-server. I think this is an issue we will have to address later when we update the vxquery-server to use NCService.

- Implemented REST API
- Altered CLI to use REST API
- Migrated XTests to use REST API

details:
- Implemented the REST Server to start through the cluster controller
  application.
- CLI now calls the REST API (remote if given, local one else) to
  execute queries.
- Migrated XTests to use the REST API to run queries related to tests
@asfgit asfgit merged commit f2e5fd9 into apache:master Aug 25, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants