Hey folks, Apologies if this is entirely the wrong place for this, Hoping someone might be able to help, I've been trying a bit of Python with all this spare time. I've done the basics so have a very loose grasp on the syntax. Basically I want to read in a CSV row by row, each row will relate to an imaginary manager and the staff working beneath them. I want break that row out into multiple rows where the amount of staff is not fixed (i.e. one manager might have 15 staff where as one has 1) For example: The top part of this CSV would be an example of an input with the bottom section an example of the desired output. I've been using the openpyxl library to begin, importing an existing workbook and doing general moving stuff between them but I'm struggling with how I would do the loop portion. I can logically talk my way through it but translating that to Python isn't so easy Heres where I've got to in my failed quest. Code: import openpyxl from openpyxl import Workbook import time #user prompt for file name print('Enter file name:') inputfile = input() #loads that workbook book = openpyxl.load_workbook(inputfile) HRsheet = book.active #Create blank workbook for output newbook = Workbook() newsheet = newbook.active #set up column headers for outputfile newsheet['A1']='Manager' newsheet['B1']='Age' newsheet['C1']='Sex' newsheet['D1']='Department' newsheet['E1']='Staff No' #Ask user how many managers there are/ how many time it needs to iterate through rows? print('Number of managers') managerno = input() #Where to start if the xlsx had headings. #print('Does your sheet have headings? Y/N') #headings = input() #if headings == 'Y'or'y': # start = 'A2' #else: # start = 'A1' #Takes values from HRSheet A2 stores them as a variable and puts them into specific cells within newsheet A3 managername = HRsheet['A2'] newsheet['A3']= (managername.value) #Give the file a name print('Give it a name:') outputfile = input() newbook.save(outputfile)#Save Any guidance appreciated!
I'd highly recommend reading Automate the Boring Stuff with Python. The image you posted doesn't seem to work. Can you try to post it again and I'll have a look
Sorry for the double folks, Been plugging away at this today and trying other solutions, I'm getting there slowly (but probably not efficiently) Below I'm reading the first row, after every count it will copy the cell value into a variable, it then copies that variable into a specific cell value of the new worksheet. I think I need a loop prior to this that will then loop rows so effectively 1) Read row 1 - copy every cell value into variable and put that variable in the new worksheet 2) When no more cell values are in row 1 the loop breaks 3) returns to the first loop and row value +1 4) Reads row 2 - '''' "" "" 5) No more rows in file - breaks loop output saves new workbook! Code: import openpyxl from openpyxl import Workbook wb = openpyxl.load_workbook('GradeSample.xlsx') sheet = wb.active newwb = Workbook() newsheet = newwb.active newsheet['A1']='Name' newsheet['B1']='Section' newsheet['C1']='Writing Grade' newsheet['D1']='Speaking Grade' list(sheet.rows)[1] # Get the first row a = 0 for cellObj in list(sheet.rows)[1]: if a == 0: pass Name=(cellObj.value) print(Name) newsheet.cell(row=2, column=1).value = Name if a == 1: pass Age=(cellObj.value) print(Age) newsheet.cell(row=2, column=2).value = Age if a == 2: pass Sex=(cellObj.value) print(Sex) newsheet.cell(row=2, column=3).value = Sex if a == 3: pass Department=(cellObj.value) print(Department) newsheet.cell(row=2, column=4).value = Department a +=1 print('Save file as:') newfile=input() newwb.save(newfile)#Save Am I getting closer...?
Just having a look now. Are you after a finished working solution or would you prefer pointers? EDIT: I've got a highly commented working solution. It's about 50 lines including comments and runs instantly. It should also support if more departments are added in the future
I was more after pointers but if you do now have a solution that would be better, I can then reverse engineer it (or read the comments...) to understand how you got there
OK ill post it in a bit. Be sure to go over that book I sent you further up. Its what I reccomend to anyone doing python
Awesome cheers pal, yeah I've been plugging away at it, I'm going to buy a physical copy though. I just can't seem to absorb from the screen!
Just thought I'd also mention pluralsight is free for april, never actually used any of their stuff but have been recommended it before and just signed up myself. Should be plenty of python stuff on there.
Code: import openpyxl from openpyxl import Workbook sheet = openpyxl.load_workbook("GradeSample.xlsx").active newwb = Workbook() newsheet = newwb.active newsheet["A1"] = "Manager" newsheet["B1"] = "Age" newsheet["C1"] = "Sex" newsheet["D1"] = "Department" newsheet["E1"] = "Staff No" for index, row in enumerate(sheet.iter_rows()): # skip the first row if index <= 1: continue # remove set the value to a set variable and remove any whitespace surroundign the value managersName = row[0].value managersAge = row[1].value managersSex = row[2].value # stop the loop once we hit a blank cell if not managersName: break numberOfStaff = 0 managers = [] for cell in row[3:]: # if the cell isn't blank we increment the total of managers and add the manager to the list if cell.value != None: numberOfStaff += 1 managers.append( ( None, None, None, sheet.cell(row=2, column=cell.column).value, cell.value, ) ) # Add the summary row to the sheet newsheet.append((managersName, managersAge, managersSex, None, numberOfStaff)) # add each manager and department underneath for manager in managers: newsheet.append(manager) print("Save file as:") newfile = input()" newwb.save(newfile + ".xlsx") # Save I was hoping that was going to have syntax highlighting but it seems it only supports HTML and PHP. You'll see the new filename has the extension automatically added.
Thanks mate! I get an error.... --------------------------------------------------------------------------- TypeError Traceback (most recent call last) <ipython-input-1-b7c9ad7abb48> in <module>() 36 None, 37 None, ---> 38 sheet.cell(row=2, column=cell.column).value, 39 cell.value, 40 ) /anaconda3/lib/python3.7/site-packages/openpyxl/worksheet/worksheet.py in cell(self, row, column, value) 293 """ 294 --> 295 if row < 1 or column < 1: 296 raise ValueError("Row or column values must be at least 1") 297 TypeError: '<' not supported between instances of 'str' and 'int'
No blank line at the top of the excel sheet When I commented out the line giving the issues it worked for the first two manager rows.
odd. Here's the exact sheet I'm using. Seems I did leave a row at the top. Which version of python are you using?
So just using Jupiter notebooks, just says Python 3, the error sees to suggest Python 3.7 If I comment out sheet.cell(row=2, column=cell.column).value I get almost the right thing, It however misses the first manager row and the names come under department.
So to fix missing the first one edit the part i use to check for the first row. Can you try it in a normal python terminal.
So changed the check row and that worked ran it here: https://repl.it/languages/python3 and it works fine... It just throws and error in Jupyter Notebooks Really appreciate your help
Hope the code makes sense. I tried to comment it as well as possible and format it with black but there still may be a few things you've not seen before.
It does make sense! just trying to make sure I totally understand what's happening so I can expand on it further I think the difference between Jupyter Notebooks & the online editor is the version of openpyxl they're using