Changes Import from Climsoft Functionalities#9070
Conversation
updated master
updated master
Updated my master
updated my master
updated my master
Updated my master
Updated master
updated my master
|
@Patowhiz I am testing using the Malawi database. It is so nice to have this. I am going to send some data (for Zambia) that could go into Initial. c) I really like it that the metadata is so prominent now. @jkmusyoka we can be checking and also improving the metadata, and then writing the improved metadata back (as well, I hope)? At least the station metadata. So we should include the shape files for Malawi and Zambia (I use gadm - is there better) in the workshop. Then we can add the administrative districts, and also (maybe?) write them back to climsoft. We should check our export dialog - how well does it work for the metadata? We must do this well before the Zambia workshop. For Malawi we could also add a variable giving the type of station. There are 96 stations there, so I suggest a Variable called Rainfall for all rows. Then change it for those that are Main stations - or whatever they want to call them. We have a special right-click feature to edit a single value - which is fine for this task! d) I now returned to the Observation data and there are no no stations, or elements? I had to press Reset and then (of course) connect to the database again. e) For information, if you go into the data selector and type kas you get kasungu straight away. Same for nkh. f) Could you change g) You have lost the most important table to import, namely the unstacked data. I am happy with all the options you have, but including the unstacked data should be the default. I think earlier, there was a checkbox for that. If so, then it should be checked by default. h) In the elements (once I have selected the stations) they now all seem to be there. I thought earlier it just gave the elements that had data at (at least one of) the stations? i) In the options there is a lot to untick, and I suggest it is rare that anyone will want all those. Also it isn't clear that there are other variables that are there anyway. Also, for daily data could the date-time variable be an option. It is confusing. I suggest all the columns to be exported into the long form (which we didn't use by the way!) should be present, and those that are always given are ticked and disabled - so you know what will be there. Then most of the others are unticked, by default. j) Finally @Patowhiz you suggested, in the menu, that we delete the Wizard option. Here is the current menu: If we delete that option, then I suggest we also delete the separator line above the Import from Climsoft. Then there is just one separator, between Import and Export. I note also that Climsoft is currently the last of the importing and the first of the exporting options. Do we keep it like that, or move it to the top, so it is the first of both? |
|
@rdstern in regards to: Item (a), I set initial as the default because the idea was for to be consistent with Climsoft version 4 data flow. I envisaged Climsoft users to first visualize and QC what is in initial table before they move the data to final, from which they can use to produce products. Nevertheless, given your feedback and considering how initial will be irrelevant in the near future, I've set the default to final. Item (b), the error message that you get when importing metadata is due the internal conversion of imported numeric columns to factor. In particular, when converting station_operational column to factor. @N-thony there is a serious bug when you convert a numeric column that has missing values only to factor. You can reproduce this by adding a new column with missing values only, then converting to numeric then to factor. Line 1501 of the data_object_R6.R file produces an empty integer that is added as a metadata attribute. I noticed there is a comment that the addition is a temporary fix to issue of add_columns not retaining attributes of new columns. I couldn't get the PR under which the temporary fix was added, so I'm not sure if putting a validation check to it would be regressing other areas. Could you check on how to fix this bug? @rdstern once the bug is fixed, it will also be solved here (note it's a general bug). Item (d), could you retest it, I couldn't reproduce it on my machine. Item (f), I was trying to distinguish the observation date from the entry date (which surprisingly doesn't exist in the 2 tables). I've now set it to "Select Dates" as suggested. I also preferred "More Options" over "Options" to indicate to users that selections done in the sub dialog are in addition to the main dialog selections. I've changed it to "Options" as suggested. Item (g), I think I removed unstack data because of performance and UX reasons. It can take quite a long time for large data sets. I've now returned it. Do you still prefer this option to be checked by default? Note, when check records is clicked, the rows showed are of the stacked data. Item(h), we both found that to be performance wise inefficient when we conducting training. Querying distinct elements on those 2 tables takes a lot of time and the UX becomes poor because users have to wait every time they reload the dialog or click on the elements receiver. That's why I decided to just displaying all the elements. Note also, there is no harm in selecting all the elements, the observation data returned will just be for the found elements. Do you still want me to revert to the feature of showing the elements found in the 2 tables only? Item(i), agreed. Should we remove the conversion to date entirely, the make date dialog can be used to convert it to date? I think the ideal thing to do is to convert the datetime column into date time R equivalent, do we currently support date time R columns? I can include a checkbox that says convert date time to date, would you prefer that? I have temporarily removed this feature and I highly recommend we not add feature to not confuse users. item(k) I suggest we remove the separator after this PR is merged, to avoid conflicts on frmMain. I don't quite understand what you mean by "I note also that Climsoft is currently the last of the importing and the first of the exporting options. Do we keep it like that, or move it to the top, so it is the first of both?" Last but not least, I'm hoping Climsoft users will get used to Selecting the dates(to filter data) and clicking on the Check Records button(intentionally added) first before clicking on Ok, failure to which they may have to learn how to kill R-Instat using the task manager. |
|
@Patowhiz many thanks for the updates and detailed reply. @jkmusyoka can we both be testing this version together? @Patowhiz you write: I don't quite understand what you mean by "I note also that Climsoft is currently the last of the importing and the first of the exporting options. Do we keep it like that, or move it to the top, so it is the first of both?" Here is the current Climatic > File menu: And you were proposing the wizard be deleted, so that will currently leave Climsoft Import a bit lonely, surrounded by 2 horizontal lines. I was assuming a bit more needs to be changed? |
|
As I have mentioned above, to avoid The Malawi and Zambia database don't have data in the initial. I didn't want to 'pollute' them with data that is not theirs. I'm happy to add if it is supplied. You can add it as well using the current Climsoft version 4 if you want. If you want to play around with a database that we can always discard the changes, I recommend using Like I have mentioned to you before, any data imported will go to initial and you can correct it from there as well. Feel free to use other databases as well if you think that is more sensible. After which you can let me know when to 'reset' them. |
|
@rdstern @jkmusyoka have you managed to review this PR. |
There was a problem hiding this comment.
@Patowhiz I give my impressions, together with problems and suggestions:
a) I had previously used the Malawi cloud database. I found on connecting this time, that it had remembered that connection. Very nice. I only had to remember rthe password.
b) I started by importing the metadata. I still get this error message?
c) Then I returned to the other button and was surprised to see the Element numbers in the data selector. It had moved from the top to the bottom receiver. Could it remain at the top one, with the stations?
d) Thanks for adding the unstacked back in. I found the order of the elements in the unstacked data to be odd. It isn't a problem with just 3, but couldn't we have the same order as the IDs?
e) Then, in the stacked data it is nice that the order of the factor levels is (I think) the same as the ID order. But @jkmusyoka couldn't we take advantage of the IDs neing positive integers to import them as levels 2, 3 and 5? That would mean that whether we specify them as ID or name, we always import both?
f) The data come in as just a date_time variable. @Patowhiz we do have date-time variables in R-Instat, so please could it come - at least as a date-time (D.T) variable when imported?
g) In the above, I also added a date variable. Please could you add the date variable on import. When it is just daily data perhaps you also detect this and omit the date time, but I would also be happy if you always gave both? (Having both the date and the date-time will be fine for hourly or 10 minute data.)
h) The Station is a character variable. Please could it become a factor.
i) I then tried importing from Initial! I know there is nothing there for those station/elements, but it should tell me and not give an error, see below:
I have not checked other options yet, but this will do for now. This is going to be really nice! (Then to check the export too! I am so happy we are doing this now and not during the Zambia workshop!)
|
@Patowhiz is this now ready for testing again? |
|
@rdstern @jkmusyoka This is now ready for review. I have made all the agreed changes. When explaining to users, please emphasise the importance of using the date filter when importing data from multiple stations. Failing to do so might result in R-Instat hanging, requiring users to know how to terminate it. |
|
I have fixed the metadata import issue (as explained in this comment) by not using the data book R functions for converting to factor. I still recommend @N-thony to use my branch This PR can now be merged regardless of whether issue #9086 is fixed or not. |
rdstern
left a comment
There was a problem hiding this comment.
@Patowhiz this is looking very nice. @N-thony over to you to check and merge. I really hope this can be in the forthcoming version.
@Patowhiz later we need to change the Climatic > File menu. Also please could we discuss whether we could add some "inventory" options. For example I'd like perhaps to import some sunshine data. How can I check which stations have any sunshine records?
|
If you want to check if a station has sunshine records, just select the station and then select the sunshine element then click on check records button. The dialog will show you the number of records present. Easy pizzy. |








Fixes #7583
Replaces PR #8229
@rdstern @jkmusyoka this is ready for review.
Note that I have made significant changes to the dialog design. I'm happy to walk you through the dialog features if the design does not clearly reflect their intended use.
Additionally, some Climsoft users have requested a search functionality in the selector, stating that it can be difficult to scroll through and select the needed stations when there are many.
I propose we remove the "Import From Climsoft Wizard" menu as it is no longer needed.
@rdstern, I also suggest we remove the default date column, as it can be misleading when the data is not daily.
This PR was also meant to complement Climsoft desktop PR #733, which includes a dedicated R-Instat dialog for importing QC data into observation initial. Once this is merged, I can demonstrate to both the R-Instat and Climsoft teams how Climsoft desktop data can be QC'd in R-Instat and then sent back to the database.
Below is a screenshot of the new dialog.
