-
Notifications
You must be signed in to change notification settings - Fork 28
VXQuery-180 : RESTful API - Rest server implementation (GSoC 2017) #172
Conversation
|
Is there any documentation that needs to be updated? Such as managing a cluster: http://vxquery.apache.org/user_cluster_installation.html |
|
Can you explain how this project interacts with vxquery-server? Will the server start the REST api? |
|
Hi @prestoncarman! At the moment, users have to setup the cluster separately and give the cluster controller's ip and port in a properties file. But I'm currently going through the AsterixDB's CCApplication class to adapt a similar thing to VXQueryApplication class. Your suggestions on this will greatly help me to come up with a solution. Thank you! |
|
I think the user should have a single command/script to start the Cluster (Hyracks - cluster controller and all data nodes and the REST service). It would be nice if all the setting for the cluster were in a single properties file. I would think that vxquery-server project would be responsible for starting the REST service after the cluster controller has been started. This may require some refactoring. Does this make sense? |
prestoncarman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a few comments. I did not look to closely at the REST API specific code. Ian can give you feedback on REST specific code. Your making great progress.
| <properties> | ||
| <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> | ||
| <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding> | ||
| <hyracks.fullstack.version>0.3.1</hyracks.fullstack.version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope this version is temporary until we can move VXQuery to the next Hyracks version. Is that true? It would be nice to be dependent on a single Hyracks version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes of course @prestoncarman. This will be removed when VXQuery moves to next Hyracks version (0.3.1 - because this is the only version having hyracks-http)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After merging #168, this can be removed I guess.
vxquery-rest/pom.xml
Outdated
| <artifactId>apache-vxquery-rest</artifactId> | ||
|
|
||
| <dependencies> | ||
| <dependency> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please sort the dependencies in alphabetical order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prestoncarman sure. I will do that
|
|
||
| public void stop() { | ||
| try { | ||
| LOGGER.log(Level.CONFIG, "Stopping rest server"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why Level.CONFIG?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should have been FINE. I will change that.
| LOGGER.log(Level.SEVERE, "Error occurred when starting rest server", e); | ||
| throw new VXQueryRuntimeException("Unable to start REST server", e); | ||
| } | ||
| LOGGER.log(Level.INFO, "Rest server started"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be in the try? For consistency, have the LOGGER messages in the same place for the start and stop functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are throwing a runtime exception in the catch block, this log line won't be reached if server start failed. I agree the log lines should be consistent. I will take stop() method's log line out of the try block for consistency. @prestoncarman are you ok with that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
| } | ||
| } | ||
|
|
||
| private VXQueryConfig loadConfiguration() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the configuration code should come from vxquery-server project. Then we can have all server properties in one place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prestoncarman Yes. I agree with you. As per my understanding, VXQuery server is started by CCDriver class. I think I need to write a new class wrapping CCDriverclass. Is that so?
Currently, starting cluster controllers, stopping them and starting node controllers are done by 3 different classes, CCDriver, NCDriver and VXQueryClusterShutdown. If I'm to start/stop REST server along with them, all of them needs to be taken to one place (At least cluster start/stop). Am I correct?
| * limitations under the License. | ||
| */ | ||
|
|
||
| package org.apache.vxquery.rest.exceptions; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could go in vxquery-core -> org.apache.vxquery.exceptions.
|
|
||
| @BeforeClass | ||
| public static void setUp() throws Exception { | ||
| System.setProperty(Constants.Properties.VXQUERY_PROPERTIES_FILE, "src/test/resources/vxquery.properties"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should move the local hyrack creation here and make the REST service only use the configuration setting you specify.
Take a look at vxquery-xtest as an example of how to set up a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will have a look and update the classes correspondingly. Thanks for pointing this out!
|
|
||
| } | ||
|
|
||
| private static QueryResponse getQueryResponse(URI uri, String accepts) throws IOException, JAXBException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to test as many of the REST parameters as possible. I think using a simple query "1 + 1" would also work here. You can even try a query with an error, to show it produces the correct message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was in my mind to write more test cases for different queries. I will be able to work on that this week.
| private static final Logger LOGGER = Logger.getLogger(RestServerTest.class.getName()); | ||
|
|
||
| private static final String QUERY_RESULT_ENDPOINT = "/vxquery/query/result/"; | ||
| private static final String QUERY = "for $x in doc(\"src/test/resources/dblp.xml\")/dblp/proceedings where $x/year=1990 return $x/title"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would change the XML file to be something smaller. You could use our test files from GHCND test. vxquery-xtest/src/test/resources/TestSources/ghcnd/... This would make your code review only few thousand lines instead of 181K. Plus, these files we created.
| @@ -0,0 +1,76 @@ | |||
| # | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file doesn't need to be committed. I used it locally for handling logs. I will remove it from git.
| System.setProperty(Constants.Properties.HYRACKS_CLIENT_IP, ccConfig.clientNetIpAddress); | ||
| System.setProperty(Constants.Properties.HYRACKS_CLIENT_PORT, String.valueOf(ccConfig.clientNetPort)); | ||
|
|
||
| application = new VXQueryApplication(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The VXQueryApplication shouldn't have to be instantiated explicitly. The ClusterController should start it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. I got the point. I can move on with CAApplication as you suggested.
| ccConfig.clusterNetPort = 39001; | ||
| ccConfig.httpPort = 39002; | ||
| ccConfig.profileDumpPeriod = 10000; | ||
| clusterControllerService = new ClusterControllerService(ccConfig); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what version of Hyracks you're on but you should be able to add the VXQueryApplication here either as a 2nd parameter to ClusterControllerService or as part of the CCConfig and then it will be started when you call clusterControllerService.start()
| # limitations under the License. | ||
| # | ||
|
|
||
| vxquery.rest.server.port=8081 No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be integrated with the parameters for the rest of VXQuery so that when you start the VXQueryApplication the REST interface will always have the parameters it needs.
|
I have altered my implementation to match the feedback given above in the PR. Now along with the ClusterController, VXQueryApplication starts. RestServer is started within the VXQueryApplication. Also I have moved the configuration loading part to VXQueryApplication start time as well. I did this by giving the VXQueryApplication as a command line argument through the app-assembler maven plugin. I'll cover the documentation part during my last two weeks of the timeline (as I proposed in GSoC proposal timeline) |
|
The REST code should still be in its own maven project. See the mailing list for the project hierarchy. Just make the configuration (the port number) for REST come from the various calling application. |
|
I got the point. I will move the REST code into its own maven project. |
prestoncarman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very close. Lets get these minor changes in so we can merge this change this week.
| */ | ||
| public class QueryRequest { | ||
|
|
||
| public QueryRequest(String requestId, String statement) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does a query request need both the requestID and statement at the same time? Would you only pass the query first, then on the second request send the requestId ("resultId").
I think we should create a separate constructor for the initial statement: QueryRequest(String statement). It seems that null is a common argument to requestId.
| QueryRequest request = new QueryRequest(null, "delete nodes /a/b//node()"); | ||
| runTest(request, CONTENT_TYPE_JSON, 400); | ||
| runTest(request, CONTENT_TYPE_XML, 400); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice addition of negative tests. What about a request for a bad result Id?
| @Test | ||
| public void testSimpleQuery001() throws Exception { | ||
| QueryRequest request = new QueryRequest(null, "1+1"); | ||
| request.setShowAbstractSyntaxTree(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about test that do not return the AST, OET, RP, TET?
| @@ -0,0 +1,16 @@ | |||
| # | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this file as it is no longer needed.
| @@ -0,0 +1,243 @@ | |||
| /* | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These test seems like REST only tests and should be in the REST project.
| protected static VXQuery vxQuery; | ||
|
|
||
| @BeforeClass | ||
| public static void setUp() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The setup, start, stop, and teardown methods should be moved to the main code in the rest server project. Create a new class called VXQueryLocalCluster. Then used this class here and in the CLI for creating a local cluster.
Here is an example local cluster class: https://github.com/apache/asterixdb/blob/master/asterixdb/asterix-app/src/main/java/org/apache/asterix/api/common/AsterixHyracksIntegrationUtil.java
|
I updated the PR according to the suggestions given above and the VXQuery CLI code is pushed to this PR as well. |
prestoncarman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the change is almost ready. One thing I did notice is that most of the files need to be formatted to VXQuery's standards. Take a look at http://vxquery.apache.org/development_eclipse_setup.html for more details. Also, I sent you a patch to include in your change.
vxquery-cli/pom.xml
Outdated
| <artifactId>maven-site-plugin</artifactId> | ||
| </plugin> | ||
| --> | ||
| <!-- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove commented out code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! I didn't remove that previously because it was there already in the previous CLI code.
| VXQuery vxq = new VXQuery(opts); | ||
| vxq.execute(); | ||
| // if -timing argument passed, show the starting and ending times | ||
| if (opts.timing) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we are loosing functionality here with timing. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. That's one thing I mentioned in the email sent to you (4-5 days back). Since we are using 2 requests to execute and fetch results for a given query, we are missing the actual time spent on executing the query. But in the CLI, we have a small chance since we are waiting for the result as soon as we submit the query to be processed. Based on that assumption, I can add the missing timing metrics if you think it is ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember seeing the e-mail. I guess this is a much larger question about the REST workflow. I guess this can be added back when someone needs to do performance testing. Does the code still support repeatedExecution? If so, I think this is needed. If not, then lets remove these CLI arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prestoncarman: Based on the discussion me and @parshimers had few minutes ago, I think the execution time is an important metric. AsterixDB don't have this problem since they are using synchronous query requests while they also support asynchronous version which doesn't measure execution time accurately. In our case, since we are using 2 requests, one to submit query and one to fetch results, we have lost the link between execution times (since we are not waiting for hyracks to finish query execution).
On repeated executions: I have halted implementing that part because, as I understand, repeated executions are used for statistical purposes. Since we have a problem in measuring execution time, repeated executions doesn't mean anything at the moment. What do you think?
I will raise this question on the mailing list as well, so that we can discuss further on this.
| printField("Runtime Plan", response.getRuntimePlan()); | ||
| } | ||
|
|
||
| // System.out.println(String.format("Reading results for '%s', result ID: %d", xqFile, response.getResultId())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove commented out code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
| return; | ||
| } | ||
|
|
||
| System.err.println(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we are adding a lot of output to the CLI. Do you think this helps the user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I agree with you. I have reduced few lines out of them.
| } | ||
|
|
||
| JobId jobId = hcc.startJob(spec, EnumSet.of(JobFlag.PROFILE_RUNTIME)); | ||
| CloseableHttpClient httpClient = HttpClients.custom() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file should be formatted using AsterixDB standard. See the website for developer code formatting details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will do that today
|
|
||
| @Option(name = "-compileonly", usage = "Compile the query and stop.") | ||
| private boolean compileOnly; | ||
| // Optional (Not supported by REST API) parameters. Only used for creating a local hyracks cluster |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not support these options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prestoncarman: What I meant by that comment was, if the user is giving ip and port for an already running REST server, we have no control over these parameters then (because the current REST api specs don't support those parameters). Otherwise, if we are creating a local hyracks cluster, we have control over them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then lets remove these options. They should be defined in a server config (either local or on the server).
| * | ||
| * @author Erandi Ganepola | ||
| */ | ||
| public class VXQuery { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this class needs a better name. Its confusing to call it VXQuery. Really this is the REST service ( maybe call it RestService). Also, why put it in a package called core? I was a little confused since we have a project called core.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prestoncarman Agree with you. I moved this class to rest.service package. What about VXQueryService for the class name?
| @Test | ||
| public void testSimpleQuery001() throws Exception { | ||
| QueryRequest request = new QueryRequest("1+1"); | ||
| request.setShowAbstractSyntaxTree(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still need test for when setShowAbstractSyntaxTree is false. Same goes for all other settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Now I have created a new test case for that in SuccessResponseTest.
| #org.apache.vxquery.available_processors=-1 | ||
|
|
||
| # Number of local node controllers to be created when creating a local hyracks cluster | ||
| org.apache.vxquery.local_nc=2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the number of local node controllers can be removed. When running a local cluster, it should always be one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prestoncarman I will change that. I have been keeping this file for reference to show that we can provide system properties to the VXQueryApplication through this file through -config app argument. I will remove this when I finish documenting the above use case.
| @@ -0,0 +1,30 @@ | |||
| # | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this file is empty. Can it be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a properties file is good, but it should be located in vxquery-server. I think this is an issue we will have to address later when we update the vxquery-server to use NCService.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prestoncarman Yes. My idea is also to use this along with vxquery-server CLI. When starting the vxquery-server, users can specify a properties file by adding -config foo/bar.properties to the CLI app arguments (after a --. ex: "-- -restPort 8080 -config foo/bar.properties").
|
|
||
| /** | ||
| * Submit a {@link QueryRequest} and fecth the resulting {@link QueryResponse} | ||
| * Submit a {@link QueryRequest} and fetth the resulting {@link QueryResponse} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo?
prestoncarman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very close. While I still think as a community we are not sure how to structure the maven projects, this gets us closer and we can always restructure it later.
| VXQuery vxq = new VXQuery(opts); | ||
| vxq.execute(); | ||
| // if -timing argument passed, show the starting and ending times | ||
| if (opts.timing) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember seeing the e-mail. I guess this is a much larger question about the REST workflow. I guess this can be added back when someone needs to do performance testing. Does the code still support repeatedExecution? If so, I think this is needed. If not, then lets remove these CLI arguments.
|
|
||
| @Option(name = "-compileonly", usage = "Compile the query and stop.") | ||
| private boolean compileOnly; | ||
| // Optional (Not supported by REST API) parameters. Only used for creating a local hyracks cluster |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then lets remove these options. They should be defined in a server config (either local or on the server).
| @@ -0,0 +1,30 @@ | |||
| # | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a properties file is good, but it should be located in vxquery-server. I think this is an issue we will have to address later when we update the vxquery-server to use NCService.
- Implemented REST API - Altered CLI to use REST API - Migrated XTests to use REST API details: - Implemented the REST Server to start through the cluster controller application. - CLI now calls the REST API (remote if given, local one else) to execute queries. - Migrated XTests to use the REST API to run queries related to tests
This PR contains the REST server implementation for VXQuery-180. It has been written on top of hyracks-http.
I will be adding more test cases along with tests for complex queries in the coming week. Please review my code and provide me with your valuable feedback.
Thank you!